Introduction to Concurrency
This interactive guide explores Go's powerful and simple approach to concurrent programming. Go was designed with concurrency as a first-class citizen, making it easier to build responsive, high-performance applications that can handle many tasks at once.
Concurrency vs. Parallelism
It's crucial to understand the difference:
- Concurrency is about dealing with many things at once. It's a way of structuring your program to handle multiple independent tasks, which may or may not run at the same time.
- Parallelism is about doing many things at once. It's the simultaneous execution of tasks, requiring multiple CPU cores.
In Go, you design for concurrency, and the Go runtime provides parallelism if the hardware supports it.
Go's Philosophy: "Share Memory by Communicating"
Go encourages a different approach to concurrency. Instead of different tasks sharing the same piece of memory and using locks to prevent conflicts (a common source of bugs), Go prefers passing data between tasks using channels.
This means that at any given time, only one task (goroutine) has ownership of the data, which prevents data races on that data by design and simplifies concurrent code.
Goroutines
Goroutines are the fundamental building blocks of concurrency in Go. A goroutine is a lightweight thread of execution managed by the Go runtime.
Launching a Goroutine
Starting a goroutine is incredibly simple. You just use the `go` keyword before a function call. This tells Go to run the function concurrently, without blocking the current flow.
go myFunction() // Run myFunction in a new goroutine
Goroutines vs. OS Threads
Goroutines are not the same as the operating system threads you might be familiar with from other languages. They are much more efficient, allowing you to run thousands or even millions of them at once.
| Feature | Goroutines | OS Threads |
|---|---|---|
| Memory | Lightweight (stack starts at 2KB and grows) | Heavyweight (often 1MB or more) |
| Management | Go runtime (user space) | Operating system (kernel) |
| Startup Time | Very fast | Slower |
| Communication | Channels (idiomatic) | Shared memory & locks |
The Main Goroutine
Every Go program starts with a single goroutine: the `main` goroutine. If the `main` goroutine finishes, the program exits immediately, even if other goroutines are still running. You must explicitly wait for other goroutines to complete if their work is important.
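A minimal sketch of the problem, using a short `time.Sleep` as a crude wait (real code would use `sync.WaitGroup`, covered below):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	go func() {
		fmt.Println("hello from a goroutine")
	}()

	// Comment out this sleep and the program will usually exit
	// before the goroutine gets a chance to run.
	time.Sleep(100 * time.Millisecond)
	fmt.Println("main exiting")
}
```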
Channels
Channels are the pipes that connect concurrent goroutines. You can send values into channels from one goroutine and receive those values into another goroutine, providing a safe and easy way to communicate.
Creating and Using Channels
// Create a channel for integers
ch := make(chan int)
// Send a value to the channel. On an unbuffered channel a send
// blocks until someone receives, so it must run in another goroutine.
go func() { ch <- 42 }()
// Receive a value from the channel
value := <-ch
Unbuffered Channels (Synchronous)
By default, channels are unbuffered. This means they will only accept a send if there is a corresponding receive ready to take the value. Both send and receive operations block until the other side is ready. This makes them great for synchronization.
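A small sketch of this synchronization: the send in `main` blocks until the (deliberately slow) receiver is ready.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	ch := make(chan string) // unbuffered

	go func() {
		time.Sleep(100 * time.Millisecond) // simulate a slow receiver
		fmt.Println("received:", <-ch)
	}()

	ch <- "ping" // blocks here (~100ms) until the receiver is ready
	fmt.Println("send completed")

	time.Sleep(10 * time.Millisecond) // give the receiver's print time to flush
}
```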
Buffered Channels (Asynchronous)
You can also create buffered channels with a fixed capacity. Sends to a buffered channel only block if the buffer is full, and receives only block if the buffer is empty. This decouples the sender and receiver.
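A sketch of the difference: sends succeed immediately while the buffer has room, and values come back out in FIFO order.

```go
package main

import "fmt"

func main() {
	ch := make(chan int, 2) // buffered channel with capacity 2

	ch <- 1 // does not block: buffer has room
	ch <- 2 // does not block: buffer is now full
	// ch <- 3 would block here until a receive frees a slot

	fmt.Println(<-ch) // 1 (FIFO order)
	fmt.Println(<-ch) // 2
}
```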
The `select` Statement
The `select` statement lets a goroutine wait on multiple channel operations. It's like a `switch` statement, but for channels. A `select` blocks until one of its cases can run, then it executes that case. If multiple are ready, it chooses one at random.
Basic `select`
select {
case msg1 := <-ch1:
    fmt.Println("received", msg1)
case msg2 := <-ch2:
    fmt.Println("received", msg2)
}
Timeouts with `select`
A very common use of `select` is to implement timeouts. By combining a case for your operation with a case for `time.After`, you can prevent a goroutine from waiting forever.
select {
case res := <-operationChannel:
    fmt.Println(res)
case <-time.After(1 * time.Second):
    fmt.Println("operation timed out!")
}
Non-Blocking Operations with `default`
Adding a `default` case to a `select` makes it non-blocking. If no other case is ready, the `default` case will run immediately.
select {
case msg := <-messages:
    fmt.Println("received message", msg)
default:
    fmt.Println("no message received")
}
Synchronization Primitives
While channels are the preferred way to communicate, sometimes you need more traditional synchronization tools. The `sync` package provides these for situations involving shared memory.
`sync.WaitGroup`
A `WaitGroup` waits for a collection of goroutines to finish. You `Add()` a count of goroutines to wait for, each goroutine calls `Done()` when it's finished, and the main goroutine calls `Wait()` to block until all are done.
`sync.Mutex`
A `Mutex` (mutual exclusion lock) is used to protect shared data from being accessed by multiple goroutines at the same time. This prevents race conditions.
Race Condition Demo: 1000 goroutines each increment a shared counter. With a mutex the result is the expected 1000; without one, lost updates make the actual result unpredictable and usually lower.
`sync.RWMutex`
An `RWMutex` (Read-Write Mutex) is a more specialized lock that allows any number of readers to access the data simultaneously, but only one writer at a time. It's useful when you have many more reads than writes.
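A sketch of a read-heavy cache (the `store` type here is hypothetical) guarded by an `RWMutex`:

```go
package main

import (
	"fmt"
	"sync"
)

type store struct {
	mu    sync.RWMutex
	cache map[string]string
}

func (s *store) get(key string) (string, bool) {
	s.mu.RLock() // any number of readers may hold the read lock
	defer s.mu.RUnlock()
	v, ok := s.cache[key]
	return v, ok
}

func (s *store) set(key, value string) {
	s.mu.Lock() // a writer needs exclusive access
	defer s.mu.Unlock()
	s.cache[key] = value
}

func main() {
	s := &store{cache: map[string]string{}}
	s.set("lang", "go")
	if v, ok := s.get("lang"); ok {
		fmt.Println("lang =", v)
	}
}
```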
`sync.Once`
A `Once` is an object that will perform an action exactly once, no matter how many times it's called from different goroutines. It's perfect for one-time initialization tasks.
Context Package
The `context` package is essential for managing the lifecycle of requests and goroutines. It allows you to pass deadlines, cancellation signals, and other request-scoped values across API boundaries and between goroutines.
Key Concepts
- Cancellation: A parent operation can cancel all the goroutines it has started.
- Timeouts/Deadlines: You can set a time limit for an operation. If it takes too long, it's automatically canceled.
- Request-scoped Values: Carry data like user IDs or trace IDs through a call stack without passing them as arguments to every function.
How it Works
You create a `context` and pass it to functions. These functions can then listen for a cancellation signal on the context's `Done()` channel using a `select` statement. When the context is canceled (either manually or by a timeout), the `Done()` channel is closed, and the goroutines can clean up and exit gracefully.
package main

import (
    "context"
    "fmt"
    "time"
)

func worker(ctx context.Context) {
    select {
    case <-time.After(2 * time.Second):
        fmt.Println("work done")
    case <-ctx.Done(): // check whether the context was canceled
        fmt.Println("work canceled:", ctx.Err())
    }
}

func main() {
    // Create a context that times out after 1 second
    ctx, cancel := context.WithTimeout(context.Background(), 1*time.Second)
    defer cancel() // important: call cancel to release resources
    worker(ctx) // prints "work canceled: context deadline exceeded"
}
Passing a `context` as the first argument to functions is a standard convention in modern Go code, especially for network services and long-running tasks.
Common Concurrency Patterns
Go's primitives enable several powerful patterns for structuring concurrent work.
Worker Pools
A worker pool is a pattern for limiting the number of goroutines running at once. Instead of launching a new goroutine for every task, you have a fixed number of "worker" goroutines that pull tasks from a jobs channel. This prevents resource exhaustion.
Worker Pool Performance Demo: with a fixed set of 10 jobs, compare how total processing time changes as the number of workers varies.
Fan-Out, Fan-In
This pattern is used to parallelize work. Fan-Out: A producer goroutine distributes jobs to multiple worker goroutines. Fan-In: A consumer goroutine collects the results from all the workers into a single channel.
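A sketch of both halves (the `square` stage is a hypothetical example of work): two workers read from the same jobs channel (fan-out), and `fanIn` merges their outputs into one channel (fan-in).

```go
package main

import (
	"fmt"
	"sync"
)

// fanIn merges several input channels into one output channel.
func fanIn(inputs ...<-chan int) <-chan int {
	out := make(chan int)
	var wg sync.WaitGroup
	for _, in := range inputs {
		wg.Add(1)
		go func(c <-chan int) {
			defer wg.Done()
			for v := range c {
				out <- v
			}
		}(in)
	}
	go func() {
		wg.Wait()
		close(out) // close only after every input is drained
	}()
	return out
}

// square is one worker stage: it squares each value it receives.
func square(in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for v := range in {
			out <- v * v
		}
	}()
	return out
}

func main() {
	jobs := make(chan int)
	go func() {
		defer close(jobs)
		for i := 1; i <= 5; i++ {
			jobs <- i // fan-out: both workers pull from this channel
		}
	}()

	merged := fanIn(square(jobs), square(jobs)) // fan-in

	sum := 0
	for v := range merged {
		sum += v
	}
	fmt.Println("sum of squares:", sum) // 55
}
```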
Pipelines
A pipeline is a series of stages connected by channels, where each stage is a goroutine. The output of one stage becomes the input for the next. This is great for stream processing, as each stage can work concurrently on different pieces of data.
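A three-stage sketch (the `generate` and `double` stages are illustrative): each stage owns its output channel and closes it when done.

```go
package main

import "fmt"

// generate emits the integers 1..n on its output channel.
func generate(n int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for i := 1; i <= n; i++ {
			out <- i
		}
	}()
	return out
}

// double is one pipeline stage: it reads, transforms, and forwards.
func double(in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for v := range in {
			out <- v * 2
		}
	}()
	return out
}

func main() {
	// generate -> double -> print: all stages run concurrently.
	for v := range double(generate(3)) {
		fmt.Println(v) // 2, 4, 6
	}
}
```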
Common Concurrency Issues
While Go makes concurrency easier, it doesn't eliminate common pitfalls. Understanding them is key to writing robust code.
Race Conditions
This happens when multiple goroutines access shared data, and at least one is writing, without synchronization. The result is unpredictable. Solution: Use channels to pass ownership of data or use `sync.Mutex` to protect access.
You can detect races using the `-race` flag: `go run -race main.go`.
Deadlocks
A deadlock occurs when goroutines are all waiting for each other, and none can proceed. A common cause is a goroutine trying to send to an unbuffered channel with no receiver ready. The Go runtime can often detect and report deadlocks, causing a panic.
Livelocks
A livelock is when goroutines are actively working but making no progress. They are busy responding to each other's state changes but not actually completing their tasks. This is rarer than a deadlock but can be harder to debug.
Go Memory Model
The Go Memory Model specifies the conditions under which a read of a variable in one goroutine is guaranteed to observe the value written by another goroutine. This is defined by the "happens-before" relationship.
"Happens-Before"
If event A happens-before event B, then the effects of A are guaranteed to be visible to B. If there is no happens-before relationship, their order is not guaranteed, and you might have a race condition.
Synchronization Establishes "Happens-Before"
You must use synchronization primitives to establish a happens-before relationship between goroutines:
- A send on a channel happens-before the corresponding receive completes.
- The closing of a channel happens-before a receive that returns a zero value because the channel is closed.
- An unlock of a `sync.Mutex` happens-before a subsequent lock on the same mutex.
The core message is: if you are accessing shared data from multiple goroutines, you must use a synchronization mechanism.
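These rules can be seen in a tiny safe-publication sketch: the write to `data` happens-before the close of `done`, which happens-before the receive in `main`, so the read is guaranteed to see the write.

```go
package main

import "fmt"

func main() {
	var data string
	done := make(chan struct{})

	go func() {
		data = "ready" // (A) write
		close(done)    // closing happens-before the receive below
	}()

	<-done            // (B) the effects of A are visible from here on
	fmt.Println(data) // always prints "ready"
}
```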