【Lock】Lock / Mutex

Posted by 西维蜀黍 on 2021-06-27, Last Modified on 2022-12-10

Lock / Mutex

In computer science, a lock or mutex (from mutual exclusion) is a synchronization primitive: a mechanism that enforces limits on access to a resource when there are many threads of execution. A lock is designed to enforce a mutual exclusion concurrency control policy, and with a variety of possible methods there exist multiple unique implementations for different applications.
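
As a minimal, self-contained sketch of this idea (the names here are illustrative, not from the original post), the C program below uses a POSIX pthread_mutex_t so that only one thread at a time executes the increment in the critical section:

```c
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0; // shared resource protected by the lock

static void *worker(void *arg) {
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);   // enter critical section
        counter++;                   // only one thread at a time runs this
        pthread_mutex_unlock(&lock); // leave critical section
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter); // always 200000 with the lock held
    return 0;
}
```

Without the lock, the two threads' read-modify-write sequences on counter would interleave and the final value would be unpredictable.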

Mutexes

A mutex can be seen as the special case of a semaphore that only takes the values 0/1, i.e., a binary semaphore. It is used to enforce mutually exclusive access to a shared resource, which is why it is also called a mutual-exclusion lock.

A mutex is a shared variable that can be in one of two states: unlocked or locked.

Because a mutex is so simple, it can be implemented in user space on top of the TSL (test-and-set-lock) or XCHG instruction.

Implementations of both mutexes and semaphores rely on an atomic instruction such as TSL to make the "test, then acquire" action indivisible.
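
A user-space spinlock makes this concrete. The sketch below (a minimal illustration, not a production lock) uses C11 atomic_exchange, which on x86 compiles to an atomic swap such as XCHG: writing 1 and reading back the previous value happen in one indivisible step, so no other thread can slip in between the "check" and the "acquire".

```c
#include <stdatomic.h>

// 0 = unlocked, 1 = locked
typedef atomic_int spinlock_t;

static void spin_lock(spinlock_t *l) {
    // atomic_exchange writes 1 and returns the old value atomically.
    // If the old value was 0, we just acquired the lock; if it was 1,
    // someone else holds it and our write changed nothing.
    while (atomic_exchange(l, 1) == 1) {
        // lock already held; spin until it becomes free
    }
}

static void spin_unlock(spinlock_t *l) {
    atomic_store(l, 0);
}
```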

Handing raw mutexes directly to programmers is risky, so some programming languages provide the **monitor** construct, which guarantees mutually exclusive access to the critical section at the compiler level; Java's synchronized keyword is one example.

Futex

Futex, short for fast userspace mutex, is a Linux feature for implementing basic locks. A futex avoids switching into kernel mode whenever possible, because a round trip from user mode into the kernel and back is expensive.
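
The sketch below shows the classic two-path structure of a futex-based lock, assuming Linux and the raw futex(2) system call (error handling omitted): the uncontended lock and unlock are a single atomic operation in user space, and the kernel is entered only to sleep or wake under contention.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

static atomic_int futex_word = 0; // 0 = unlocked, 1 = locked

static void futex_lock(void) {
    int expected = 0;
    // Fast path: an uncontended acquire stays entirely in user space.
    while (!atomic_compare_exchange_strong(&futex_word, &expected, 1)) {
        // Slow path: the lock is held, so ask the kernel to put us to
        // sleep until futex_word may have changed from 1.
        syscall(SYS_futex, &futex_word, FUTEX_WAIT, 1, NULL, NULL, 0);
        expected = 0;
    }
}

static void futex_unlock(void) {
    atomic_store(&futex_word, 0);
    // Wake at most one sleeping waiter.
    syscall(SYS_futex, &futex_word, FUTEX_WAKE, 1, NULL, NULL, 0);
}
```

Note that this simplified unlock always makes the FUTEX_WAKE syscall; real implementations (e.g., glibc's pthread mutex) keep a "waiters present" state in the futex word so that an uncontended unlock also stays entirely in user space.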

Differences Between Semaphores and Mutexes

  • Mutexes are used for mutual exclusion between processes/threads, while semaphores are used for synchronization between processes/threads.
    • This is the fundamental difference between the two mechanisms, which is exactly the difference between mutual exclusion and synchronization.
    • Mutual exclusion: a resource admits only one accessor at a time; access is unique and exclusive. However, mutual exclusion says nothing about the order in which accessors reach the resource, i.e., access is unordered.
    • Synchronization: ordered access to a resource, achieved through additional mechanisms and (in most cases) built on top of mutual exclusion. In most cases synchronization already implies mutual exclusion; in particular, all writes to a resource must be mutually exclusive. The minority of cases are those in which multiple accessors are allowed to access the resource at the same time.
  • A mutex can only take the value 0 or 1, whereas a semaphore's value can be any non-negative integer.
  • A mutex must be locked and unlocked by the same process/thread, whereas a semaphore can be released by one process/thread and acquired by another.

To summarize, a mutex is a special case of a semaphore. A mutex can only control mutually exclusive access to a single resource; it cannot solve the problem of multiple processes contending for multiple resources. A semaphore can provide both mutual exclusion and synchronization for multiple processes over multiple instances of the same kind of resource. When a semaphore is a binary semaphore, it is equivalent to a mutex and provides mutually exclusive access to one resource.
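
The contrast is easy to see with the POSIX APIs. In the sketch below (a minimal illustration assuming POSIX semaphores on Linux), a counting semaphore initialized to 3 admits up to three concurrent users of a pooled resource, and a second semaphore is posted by one thread and awaited by another to impose ordering. A pthread_mutex_t, by contrast, must be unlocked by the thread that locked it; for the default mutex type, unlocking from another thread is undefined behavior.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

static sem_t pool;  // counting semaphore: 3 identical resources
static sem_t ready; // semaphore used purely for signaling/ordering

static void *producer(void *arg) {
    // A semaphore can be posted by one thread...
    sem_post(&ready);
    return NULL;
}

static void *consumer(void *arg) {
    sem_wait(&ready); // ...and awaited by another: synchronization
    sem_wait(&pool);  // take one of the 3 resources (value 3 -> 2)
    puts("using one pooled resource");
    sem_post(&pool);  // return it to the pool
    return NULL;
}

int main(void) {
    sem_init(&pool, 0, 3);  // a semaphore holds any non-negative integer
    sem_init(&ready, 0, 0);
    pthread_t p, c;
    pthread_create(&c, NULL, consumer, NULL);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}
```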

Granularity

Before being introduced to lock granularity, one needs to understand three concepts about locks:

  • lock overhead: the extra resources for using locks, like the memory space allocated for locks, the CPU time to initialize and destroy locks, and the time for acquiring or releasing locks. The more locks a program uses, the more overhead associated with the usage;
  • lock contention: this occurs whenever one process or thread attempts to acquire a lock held by another process or thread. The more fine-grained the available locks, the less likely one process/thread will request a lock held by the other. (For example, locking a row rather than the entire table, or locking a cell rather than the entire row.);
  • deadlock: the situation when each of at least two tasks is waiting for a lock that the other task holds. Unless something is done, the two tasks will wait forever.
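
For example, the classic deadlock arises when two threads acquire the same two locks in opposite orders, as in this deliberately broken sketch:

```c
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t b = PTHREAD_MUTEX_INITIALIZER;

static void *task1(void *arg) {
    pthread_mutex_lock(&a);
    sleep(1);               // widen the window so the deadlock is reliable
    pthread_mutex_lock(&b); // blocks: task2 holds b and is waiting for a
    pthread_mutex_unlock(&b);
    pthread_mutex_unlock(&a);
    return NULL;
}

static void *task2(void *arg) {
    pthread_mutex_lock(&b); // opposite acquisition order to task1
    sleep(1);
    pthread_mutex_lock(&a); // blocks: task1 holds a and is waiting for b
    pthread_mutex_unlock(&a);
    pthread_mutex_unlock(&b);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, task1, NULL);
    pthread_create(&t2, NULL, task2, NULL);
    pthread_join(t1, NULL); // never returns: the two tasks are deadlocked
    pthread_join(t2, NULL);
    return 0;
}
```

A common remedy is to define a global lock-acquisition order (e.g., always a before b) that every task follows.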

There is a tradeoff between decreasing lock overhead and decreasing lock contention when choosing the number of locks in synchronization.

An important property of a lock is its granularity. The granularity is a measure of the amount of data the lock is protecting. In general, choosing a coarse granularity (a small number of locks, each protecting a large segment of data) results in less lock overhead when a single process is accessing the protected data, but worse performance when multiple processes are running concurrently, because of increased lock contention. The coarser the lock, the higher the likelihood that it will stop an unrelated process from proceeding. Conversely, using a fine granularity (a larger number of locks, each protecting a fairly small amount of data) increases the overhead of the locks themselves but reduces lock contention. Granular locking, where each process must hold multiple locks from a common set, can create subtle lock dependencies. This subtlety increases the chance that a programmer will unknowingly introduce a deadlock.
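
As a concrete illustration of the tradeoff (the table here is hypothetical), compare one coarse lock over a whole table with one fine lock per row:

```c
#include <pthread.h>

#define NROWS 1024

// Coarse-grained: one lock protects the entire table. Minimal lock
// overhead, but every writer contends on the same lock, even when
// they touch unrelated rows.
struct coarse_table {
    pthread_mutex_t lock;
    int rows[NROWS];
};

// Fine-grained: one lock per row. 1024 locks to initialize and store
// (higher overhead), but threads updating different rows no longer
// contend with each other.
struct fine_table {
    pthread_mutex_t locks[NROWS]; // each must be initialized before use
    int rows[NROWS];
};

static void fine_update(struct fine_table *t, int row, int value) {
    pthread_mutex_lock(&t->locks[row]); // only this row is blocked
    t->rows[row] = value;
    pthread_mutex_unlock(&t->locks[row]);
}
```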

Disadvantages

Lock-based resource protection and thread/process synchronization have many disadvantages:

  • Contention: some threads/processes have to wait until a lock (or a whole set of locks) is released. If one of the threads holding a lock dies, stalls, blocks, or enters an infinite loop, other threads waiting for the lock may wait forever.
  • Overhead: the use of locks adds overhead for each access to a resource, even when the chances for collision are very rare. (However, any chance for such collisions is a race condition.)
  • Debugging: bugs associated with locks are time dependent and can be very subtle and extremely hard to replicate, such as deadlocks.
  • Instability: the optimal balance between lock overhead and lock contention can be unique to the problem domain (application) and sensitive to design, implementation, and even low-level system architectural changes. These balances may change over the life cycle of an application and may entail tremendous changes to update (re-balance).
  • Composability: locks are only composable (e.g., managing multiple concurrent locks in order to atomically delete item X from table A and insert X into table B) with relatively elaborate (overhead) software support and perfect adherence by applications programming to rigorous conventions.
  • Priority inversion: a low-priority thread/process holding a common lock can prevent high-priority threads/processes from proceeding. Priority inheritance can be used to reduce priority-inversion duration (see the sketch after this list). The priority ceiling protocol can be used on uniprocessor systems to minimize the worst-case priority-inversion duration, as well as prevent deadlock.
  • Convoying: all other threads have to wait if a thread holding a lock is descheduled due to a time-slice interrupt or page fault.
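
On platforms that support it, POSIX exposes priority inheritance as a mutex attribute; a minimal sketch (assuming the implementation provides _POSIX_THREAD_PRIO_INHERIT):

```c
#include <pthread.h>

// Build a mutex using the priority-inheritance protocol: while a
// low-priority thread holds the lock, it temporarily runs at the
// priority of the highest-priority thread blocked on the lock.
int make_pi_mutex(pthread_mutex_t *m) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

(PTHREAD_PRIO_PROTECT, the priority ceiling protocol mentioned above, is configured through the same attribute interface.)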

Some concurrency control strategies avoid some or all of these problems. For example, a funnel or serializing tokens can avoid the biggest problem: deadlocks. Alternatives to locking include non-blocking synchronization methods, like lock-free programming techniques and transactional memory. However, such alternative methods often require that the actual lock mechanisms be implemented at a more fundamental level of the operating software. Therefore, they may only relieve the application level from the details of implementing locks, with the problems listed above still needing to be dealt with beneath the application.

In most cases, proper locking depends on the CPU providing a method of atomic instruction stream synchronization (for example, the addition or deletion of an item into a pipeline requires that all contemporaneous operations needing to add or delete other items in the pipe be suspended during the manipulation of the memory content required to add or delete the specific item). Therefore, an application can often be more robust when it recognizes the burdens it places upon an operating system and is capable of graciously recognizing the reporting of impossible demands.
