Show HN: Deadlog – almost drop-in mutex for debugging Go deadlocks

Original link: https://github.com/stevenctl/deadlog

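## Deadlog: Debugging mutex deadlocks in Go

Deadlog is a Go library that helps identify and diagnose mutex deadlocks. It wraps `sync.Mutex` and `sync.RWMutex` and adds logging and analysis tools. To adopt it, replace your mutex declaration with `deadlog.New()`.

The library provides `LockFunc()` and `RLockFunc()`, which, unlike the standard `Lock()`/`RLock()`, log `RELEASED` events so that unreleased locks can be detected. Use `WithName()` to label a lock in the logs, and `WithTrace()` to include a stack trace that pinpoints where the lock was acquired.

Deadlog writes JSON events to stdout (customizable through a logger or writer). These logs can be analyzed with the `deadlog analyze` CLI tool, or programmatically through the `github.com/stevenctl/deadlog/analyze` package. The analyzer identifies "stuck" goroutines (waiting to acquire a lock) and "held" locks (acquired but never released) and produces a contention report.

Deadlog supports both tracked lock types (LOCK/RLOCK, which emit RELEASED events) and untracked lock types (WLOCK/RWLOCK). It is intended for finding deadlocks in concurrent Go applications.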

Hacker News post (github.com/stevenctl), 8 points by dirteater_ 1 hour ago:

I've used this println-debugging approach many times, together with some sed/awk tooling to find the call that causes the problem. Now it is a small Go package.

With some `runtime.Callers`, I can usually find the problem simply by swapping the existing Mutex or RWMutex for this one.

Sometimes I replace `mu.Lock()` / `defer mu.Unlock()` with LockFunc/RLockFunc to get more detail: `defer mu.LockFunc()()`.

I almost always initialize it with `deadlog.New(deadlog.WithTrace(1))`, and that is enough.

It's not the most polished library, but it should never make it into a commit anyway; it's only a temporary debugging aid. I find it useful.
Original text

A Go library for debugging mutex deadlocks with logged wrappers and analysis tools.

go get github.com/stevenctl/deadlog

Replace sync.Mutex or sync.RWMutex with deadlog.Mutex:

import "github.com/stevenctl/deadlog"

// Before
var mu sync.RWMutex

// After
var mu = deadlog.New(deadlog.WithName("my-service"))

The API is compatible with both sync.Mutex and sync.RWMutex:

// Write lock (sync.Mutex compatible)
mu.Lock()
defer mu.Unlock()

// Read lock (sync.RWMutex compatible)
mu.RLock()
defer mu.RUnlock()

Tracking unreleased locks

Use LockFunc() or RLockFunc() to get correlated RELEASED events:

unlock := mu.LockFunc()
defer unlock()

This logs START, ACQUIRED, and RELEASED events with the same correlation ID, making it easy to identify which lock was never released.
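To illustrate why returning an unlock function makes correlation possible, here is a minimal sketch built on `sync.Mutex`. The names `tracedMutex`, `event`, and `snapshot`, and the sequential IDs, are hypothetical and chosen for illustration; this is not deadlog's implementation, which logs JSON and uses random IDs.

```go
package main

import (
	"fmt"
	"sync"
)

// event is a simplified, hypothetical record of one lock state change.
type event struct {
	State string
	ID    int
}

// tracedMutex is an illustrative wrapper, not deadlog's actual type.
type tracedMutex struct {
	mu     sync.Mutex // the lock being instrumented
	logMu  sync.Mutex // guards events and nextID
	events []event
	nextID int
}

// record appends one event under logMu.
func (t *tracedMutex) record(state string, id int) {
	t.logMu.Lock()
	defer t.logMu.Unlock()
	t.events = append(t.events, event{State: state, ID: id})
}

// newID returns a fresh ID for one lock operation.
func (t *tracedMutex) newID() int {
	t.logMu.Lock()
	defer t.logMu.Unlock()
	t.nextID++
	return t.nextID
}

// LockFunc acquires the lock and returns the function that releases it.
// The closure captures id, so RELEASED carries the same ID as START and ACQUIRED.
func (t *tracedMutex) LockFunc() func() {
	id := t.newID()
	t.record("START", id)
	t.mu.Lock()
	t.record("ACQUIRED", id)
	return func() {
		t.mu.Unlock()
		t.record("RELEASED", id)
	}
}

// snapshot returns a copy of the events recorded so far.
func (t *tracedMutex) snapshot() []event {
	t.logMu.Lock()
	defer t.logMu.Unlock()
	return append([]event(nil), t.events...)
}

func main() {
	var m tracedMutex
	func() {
		// LockFunc() runs immediately; the returned unlock runs when this function returns.
		defer m.LockFunc()()
	}()
	for _, e := range m.snapshot() {
		fmt.Printf("%s id=%d\n", e.State, e.ID)
	}
}
```

A plain `Unlock()` call has no way to know which `Lock()` it pairs with, which is why the closure-based form is the one that can emit a correlated RELEASED event.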

Named callsites

Use WithLockName() to label individual lock operations on the same mutex:

mu := deadlog.New(deadlog.WithName("player-state"), deadlog.WithTrace(1))

// Each callsite gets its own name in the logs
unlock := mu.LockFunc(deadlog.WithLockName("update-health"))
defer unlock()

Combined with WithTrace(1), the JSON events pinpoint exactly what's happening:

{"type":"LOCK","state":"START","name":"update-health","id":4480578,"trace":"updateHealth:25","ts":1770746273707970140}
{"type":"LOCK","state":"ACQUIRED","name":"update-health","id":4480578,"trace":"updateHealth:25","ts":1770746273707993939}
{"type":"LOCK","state":"START","name":"add-item","id":9375956,"trace":"addItem:29","ts":1770746273707996887}
{"type":"LOCK","state":"ACQUIRED","name":"add-item","id":9375956,"trace":"addItem:29","ts":1770746273707998734}
{"type":"LOCK","state":"START","name":"apply-damage","id":6439038,"trace":"applyDamage:33","ts":1770746273708002604}

The analyzer turns this into a clear report: apply-damage is stuck waiting, while update-health and add-item are holding their locks.

===============================================
  LOCK CONTENTION ANALYSIS
===============================================

=== STUCK: Started but never acquired (waiting for lock) ===
  LOCK  | apply-damage         | ID: 6439038
         Trace: applyDamage:33

=== HELD: Acquired but never released (holding lock) ===
  LOCK  | update-health        | ID: 4480578
         Trace: updateHealth:25
  LOCK  | add-item             | ID: 9375956
         Trace: addItem:29

=== SUMMARY ===
  Stuck waiting: 1
  Held:          2

Enable stack traces to see where locks are being acquired:

mu := deadlog.New(
    deadlog.WithName("my-mutex"),
    deadlog.WithTrace(5), // 5 frames deep
)
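The author's post mentions `runtime.Callers` as the mechanism behind traces. As a rough sketch of how a `function:line` string like the `trace` field in the sample events could be produced, the hypothetical helper `callerTrace` below walks the caller's stack with the standard `runtime` package. The separator and name trimming are assumptions; deadlog's actual trace format for multiple frames may differ.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// callerTrace returns up to depth frames of the caller's stack as
// "function:line" entries. It is a hypothetical helper for this sketch.
func callerTrace(depth int) string {
	if depth <= 0 {
		return ""
	}
	pcs := make([]uintptr, depth)
	// skip=2 omits runtime.Callers itself and callerTrace,
	// so the first recorded frame is our caller.
	n := runtime.Callers(2, pcs)
	if n == 0 {
		return ""
	}
	frames := runtime.CallersFrames(pcs[:n])
	var parts []string
	for len(parts) < depth {
		frame, more := frames.Next()
		name := frame.Function
		// Drop the package path: "main.updateHealth" becomes "updateHealth".
		if i := strings.LastIndex(name, "."); i >= 0 {
			name = name[i+1:]
		}
		parts = append(parts, fmt.Sprintf("%s:%d", name, frame.Line))
		if !more {
			break
		}
	}
	return strings.Join(parts, " <- ")
}

// updateHealth stands in for a function that takes a lock.
//
//go:noinline
func updateHealth() string {
	return callerTrace(1)
}

func main() {
	// Prints the function name and the line of the callerTrace call.
	fmt.Println(updateHealth())
}
```

A depth of 1 identifies only the function that took the lock; larger depths help when the lock is taken inside a shared helper and the interesting caller is further up the stack.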

By default, events are written as JSON to stdout. Use a custom logger:

mu := deadlog.New(
    deadlog.WithLogger(func(e deadlog.Event) {
        log.Printf("[DEADLOG] %s %s %s id=%d", e.Type, e.State, e.Name, e.ID)
    }),
)

Or write to a specific writer:

f, _ := os.Create("locks.jsonl") // error ignored for brevity; check it in real code
mu := deadlog.New(deadlog.WithLogger(deadlog.WriterLogger(f)))

Install the CLI:

go install github.com/stevenctl/deadlog/cmd/deadlog@latest

Analyze a log file:

Or pipe from your application:

go run ./myapp 2>&1 | deadlog analyze -

See Named callsites above for example output.

Use the analysis library programmatically:

import "github.com/stevenctl/deadlog/analyze"

result, err := analyze.AnalyzeFile("app.log")
if err != nil {
    log.Fatal(err)
}

fmt.Printf("Stuck: %d, Held: %d\n", len(result.Stuck), len(result.Held))

// Print formatted report
analyze.PrintReport(os.Stdout, result)

Events are logged as JSON:

{"type":"LOCK","state":"START","name":"my-mutex","id":1234567,"ts":1704067200000000000}
{"type":"LOCK","state":"ACQUIRED","name":"my-mutex","id":1234567,"ts":1704067200000001000}
{"type":"LOCK","state":"RELEASED","name":"my-mutex","id":1234567,"ts":1704067200000002000}

Fields:

  • type: lock type (see below)
  • state: START, ACQUIRED, or RELEASED
  • name: mutex name from WithName(), or the per-operation name from WithLockName() when one is given
  • id: correlation ID (random, same for START/ACQUIRED/RELEASED of one lock operation)
  • ts: unix nanoseconds
  • trace: stack trace (if enabled with WithTrace())
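As a small worked example of consuming these fields, the sketch below decodes two of the sample events with `encoding/json` and subtracts their `ts` values to get the hold time. The `Event` struct here is defined locally to mirror the documented fields and is an assumption, not the type exported by deadlog; `parseEvent` and `heldFor` are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Event mirrors the documented JSON fields. It is a local definition
// for this sketch, not a type imported from deadlog.
type Event struct {
	Type  string `json:"type"`
	State string `json:"state"`
	Name  string `json:"name"`
	ID    int64  `json:"id"`
	TS    int64  `json:"ts"`              // unix nanoseconds
	Trace string `json:"trace,omitempty"` // present only when tracing is enabled
}

// parseEvent decodes one JSON log line.
func parseEvent(line string) (Event, error) {
	var e Event
	err := json.Unmarshal([]byte(line), &e)
	return e, err
}

// heldFor returns the time between an ACQUIRED event and its RELEASED event.
func heldFor(acquired, released Event) time.Duration {
	return time.Duration(released.TS - acquired.TS)
}

func main() {
	acq, err := parseEvent(`{"type":"LOCK","state":"ACQUIRED","name":"my-mutex","id":1234567,"ts":1704067200000001000}`)
	if err != nil {
		panic(err)
	}
	rel, err := parseEvent(`{"type":"LOCK","state":"RELEASED","name":"my-mutex","id":1234567,"ts":1704067200000002000}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(acq.Name, heldFor(acq, rel)) // prints: my-mutex 1µs
}
```

Decoding `ts` into an `int64` rather than a `float64` keeps the nanosecond values exact.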

| Method | Type | Tracked | Description |
|---|---|---|---|
| LockFunc() | LOCK | Yes | Write lock with RELEASED tracking |
| RLockFunc() | RLOCK | Yes | Read lock with RELEASED tracking |
| Lock() | WLOCK | No | Write lock, no RELEASED event |
| RLock() | RWLOCK | No | Read lock, no RELEASED event |

Tracked types (LOCK, RLOCK) emit RELEASED events via the unlock function, so the analyzer can detect held locks. Untracked types (WLOCK, RWLOCK) are drop-in compatible with sync.Mutex/sync.RWMutex but won't be reported as "held" since there's no RELEASED event to correlate.

Use untracked methods (Lock()/RLock()) initially to detect contention, then switch to tracked methods (LockFunc()/RLockFunc()) where you need to identify which locks are being held.

  1. START: Logged before attempting to acquire the lock
  2. ACQUIRED: Logged after the lock is acquired
  3. RELEASED: Logged when the unlock function is called (only with LockFunc()/RLockFunc())

The analyzer detects:

  • Stuck: START without ACQUIRED (goroutine waiting for a lock) - all types
  • Held: ACQUIRED without RELEASED (lock not released) - tracked types only (LOCK, RLOCK)
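The two detection rules above can be sketched in a few lines: group events by `id`, then check which states were seen. The function `classifyLog` and the trimmed `event` struct below are hypothetical and written only to illustrate the rules; they are not the code of the `analyze` package.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// event holds only the fields this sketch needs.
type event struct {
	Type  string `json:"type"`
	State string `json:"state"`
	Name  string `json:"name"`
	ID    int64  `json:"id"`
}

// sampleLog reuses the five named-callsite events, minus trace and ts.
const sampleLog = `{"type":"LOCK","state":"START","name":"update-health","id":4480578}
{"type":"LOCK","state":"ACQUIRED","name":"update-health","id":4480578}
{"type":"LOCK","state":"START","name":"add-item","id":9375956}
{"type":"LOCK","state":"ACQUIRED","name":"add-item","id":9375956}
{"type":"LOCK","state":"START","name":"apply-damage","id":6439038}`

// classifyLog groups events by ID and returns the names of stuck and held operations.
func classifyLog(log string) (stuck, held []string) {
	type status struct {
		first                       event
		started, acquired, released bool
	}
	ops := map[int64]*status{}
	var order []int64 // IDs in first-seen order, for stable output

	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		var e event
		if err := json.Unmarshal(sc.Bytes(), &e); err != nil {
			continue // skip lines that are not JSON events
		}
		s, ok := ops[e.ID]
		if !ok {
			s = &status{first: e}
			ops[e.ID] = s
			order = append(order, e.ID)
		}
		switch e.State {
		case "START":
			s.started = true
		case "ACQUIRED":
			s.acquired = true
		case "RELEASED":
			s.released = true
		}
	}
	for _, id := range order {
		s := ops[id]
		// Only tracked types ever emit RELEASED, so only they can be reported as held.
		tracked := s.first.Type == "LOCK" || s.first.Type == "RLOCK"
		switch {
		case s.started && !s.acquired:
			stuck = append(stuck, s.first.Name)
		case s.acquired && tracked && !s.released:
			held = append(held, s.first.Name)
		}
	}
	return stuck, held
}

func main() {
	stuck, held := classifyLog(sampleLog)
	fmt.Printf("stuck=%v held=%v\n", stuck, held)
	// prints: stuck=[apply-damage] held=[update-health add-item]
}
```

Skipping lines that fail to parse lets the same loop run over mixed output, such as application logs interleaved with lock events.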

License: MIT
