๐Ÿ‘ฎ๐Ÿผโ€โ™‚๏ธ ๋ฉ”๋ชจ๋ฆฌ ์œ„๋ฐ˜ ํƒ์ง€

๊ฐœ์š”

ํ…Œ์ŠคํŠธ๊ฐ€ ํ†ต๊ณผํ•˜๋Š” ๊ฒƒ์ฒ˜๋Ÿผ ๋ณด์—ฌ๋„ GPU ํ”„๋กœ๊ทธ๋žจ์„ ์กฐ์šฉํžˆ ์†์ƒ์‹œํ‚ฌ ์ˆ˜ ์žˆ๋Š” ๋ฉ”๋ชจ๋ฆฌ ์œ„๋ฐ˜์„ ํƒ์ง€ํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ๋ฐฐ์›๋‹ˆ๋‹ค. NVIDIA์˜ compute-sanitizer(pixi๋ฅผ ํ†ตํ•ด ์‚ฌ์šฉ ๊ฐ€๋Šฅ)์™€ memcheck ๋„๊ตฌ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ, GPU ์ฝ”๋“œ์—์„œ ์˜ˆ์ธก ๋ถˆ๊ฐ€๋Šฅํ•œ ๋™์ž‘์„ ์ผ์œผํ‚ฌ ์ˆ˜ ์žˆ๋Š” ์ˆจ์€ ๋ฉ”๋ชจ๋ฆฌ ๋ฒ„๊ทธ๋ฅผ ๋ฐœ๊ฒฌํ•˜๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.

ํ•ต์‹ฌ ํ†ต์ฐฐ: GPU ํ”„๋กœ๊ทธ๋žจ์€ ๋ถˆ๋ฒ•์ ์ธ ๋ฉ”๋ชจ๋ฆฌ ์ ‘๊ทผ์„ ์ˆ˜ํ–‰ํ•˜๋ฉด์„œ๋„ ๋™์‹œ์— โ€œ์˜ฌ๋ฐ”๋ฅธโ€ ๊ฒฐ๊ณผ๋ฅผ ๋งŒ๋“ค์–ด๋‚ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์„ ํ–‰ ํ•™์Šต: Puzzle 4 LayoutTensor์™€ ๊ธฐ๋ณธ์ ์ธ GPU ๋ฉ”๋ชจ๋ฆฌ ๊ฐœ๋…์— ๋Œ€ํ•œ ์ดํ•ด๊ฐ€ ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค.

์กฐ์šฉํ•œ ๋ฉ”๋ชจ๋ฆฌ ๋ฒ„๊ทธ์˜ ๋ฐœ๊ฒฌ

ํ…Œ์ŠคํŠธ๋Š” ํ†ต๊ณผํ–ˆ์ง€๋งŒ, ์ฝ”๋“œ๊ฐ€ ์ •๋ง ์˜ฌ๋ฐ”๋ฅธ ๊ฑธ๊นŒ?

์–ผํ• ๋ฌดํ•ดํ•ด ๋ณด์ด๊ณ  ์™„๋ฒฝํ•˜๊ฒŒ ๋™์ž‘ํ•˜๋Š” ๋“ฏํ•œ ํ”„๋กœ๊ทธ๋žจ์œผ๋กœ ์‹œ์ž‘ํ•ด ๋ด…์‹œ๋‹ค (๊ฐ€๋“œ๊ฐ€ ์—†๋Š” Puzzle 04์ž…๋‹ˆ๋‹ค):

fn add_10_2d(
    output: LayoutTensor[dtype, layout, MutAnyOrigin],
    a: LayoutTensor[dtype, layout, ImmutAnyOrigin],
    size: UInt,
):
    row = thread_idx.y
    col = thread_idx.x
    # No bounds check: threads with row >= size or col >= size still index the tensors.
    output[row, col] = a[row, col] + 10.0


View the full file: problems/p10/p10.mojo

When you run this program normally, everything looks fine:

pixi run p10 --memory-bug
out shape: 2 x 2
Running memory bug example (bounds checking issue)...
out: HostBuffer([10.0, 11.0, 12.0, 13.0])
expected: HostBuffer([10.0, 11.0, 12.0, 13.0])
✅ Memory test PASSED! (memcheck may find bounds violations)

✅ Test passed! The output matches the expected result exactly. Case closed, right?

Not quite. Let's see what compute-sanitizer reports:

MODULAR_DEVICE_CONTEXT_MEMORY_MANAGER_SIZE_PERCENT=0 pixi run compute-sanitizer --tool memcheck mojo problems/p10/p10.mojo --memory-bug

Note: MODULAR_DEVICE_CONTEXT_MEMORY_MANAGER_SIZE_PERCENT=0 is an environment variable, set on the command line, that disables the device context's buffer cache. With the cache disabled, memory issues such as bounds violations that normal caching behavior would otherwise hide can surface. (When the buffer cache is enabled, freed memory is kept for reuse instead of being returned immediately. An out-of-bounds access can therefore land in a cached region that is still valid, and the error goes unnoticed. With the cache disabled, memory is returned immediately and the violation is detected.)

========= COMPUTE-SANITIZER
out shape: 2 x 2
Running memory bug example (bounds checking issue)...

========= Invalid __global__ read of size 4 bytes
=========     at p10_add_10_2d_...+0x80
=========     by thread (2,1,0) in block (0,0,0)
=========     Access at 0xe0c000210 is out of bounds
=========     and is 513 bytes after the nearest allocation at 0xe0c000000 of size 16 bytes

========= Invalid __global__ read of size 4 bytes
=========     at p10_add_10_2d_...+0x80
=========     by thread (0,2,0) in block (0,0,0)
=========     Access at 0xe0c000210 is out of bounds
=========     and is 513 bytes after the nearest allocation at 0xe0c000000 of size 16 bytes

========= Invalid __global__ read of size 4 bytes
=========     at p10_add_10_2d_...+0x80
=========     by thread (1,2,0) in block (0,0,0)
=========     Access at 0xe0c000214 is out of bounds
=========     and is 517 bytes after the nearest allocation at 0xe0c000000 of size 16 bytes

========= Invalid __global__ read of size 4 bytes
=========     at p10_add_10_2d_...+0x80
=========     by thread (2,2,0) in block (0,0,0)
=========     Access at 0xe0c000218 is out of bounds
=========     and is 521 bytes after the nearest allocation at 0xe0c000000 of size 16 bytes

========= Program hit CUDA_ERROR_LAUNCH_FAILED (error 719) due to "unspecified launch failure" on CUDA API call to cuStreamSynchronize.
========= Program hit CUDA_ERROR_LAUNCH_FAILED (error 719) due to "unspecified launch failure" on CUDA API call to cuEventCreate.
========= Program hit CUDA_ERROR_LAUNCH_FAILED (error 719) due to "unspecified launch failure" on CUDA API call to cuMemFreeAsync.

========= ERROR SUMMARY: 7 errors

Even though every test passed, the program has 7 errors in total:

  • 4 memory violations (Invalid __global__ read)
  • 3 runtime errors (caused by the memory violations)

Understanding the hidden bug

Root cause analysis

The problem:

  • Tensor size: 2×2 (valid indices: 0, 1)
  • Launched threads: 3×3 in a single block (thread indices: 0, 1, 2)
  • Out-of-bounds threads: (2,1), (0,2), (1,2), (2,2) access invalid memory
  • Missing bounds check: thread_idx is never validated against the tensor dimensions
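
The mismatch can be checked with a small host-side model. The sketch below is plain Python rather than Mojo, and it assumes a row-major 2×2 tensor of 4-byte elements (consistent with the 16-byte allocation in the sanitizer report). The helper names are hypothetical. It maps each thread of the 3×3 launch to the byte offset it would touch and reports which offsets fall past the end of the allocation.

```python
# Host-side model of the unguarded kernel's indexing (not GPU code).
# Assumptions: row-major layout, 4-byte elements, one 3x3 thread block.
SIZE = 2            # tensor is SIZE x SIZE
THREADS = 3         # threads per block in x and y
ELEM_BYTES = 4
ALLOC_BYTES = SIZE * SIZE * ELEM_BYTES  # 16 bytes, as in the report


def byte_offset(row: int, col: int) -> int:
    """Byte offset of element [row, col] in a row-major SIZE x SIZE tensor."""
    return (row * SIZE + col) * ELEM_BYTES


def classify_threads():
    """Return (outside_allocation, aliased) lists of (x, y) thread indices."""
    outside, aliased = [], []
    for y in range(THREADS):          # thread_idx.y -> row
        for x in range(THREADS):      # thread_idx.x -> col
            row, col = y, x
            if row < SIZE and col < SIZE:
                continue              # valid thread, nothing to report
            if byte_offset(row, col) >= ALLOC_BYTES:
                outside.append((x, y))
            else:
                aliased.append((x, y))
    return outside, aliased


outside, aliased = classify_threads()
print("past the allocation:", outside)
print("out of range but inside the allocation:", aliased)
```

Under these assumptions, the threads whose offsets reach 16 bytes or more are exactly the four that memcheck flags. Thread (2,0) is also out of range for a 2×2 tensor, but its offset aliases element [1, 0] inside the allocation, which is presumably why memcheck does not report it.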

Understanding all 7 errors

4 memory violations:

  • Each out-of-bounds thread, (2,1), (0,2), (1,2), and (2,2), triggers an Invalid __global__ read

3 CUDA runtime errors:

  • cuStreamSynchronize fails because the kernel launch failed
  • cuEventCreate fails during cleanup
  • cuMemFreeAsync fails during memory deallocation

Key insight: Memory violations cascade. A single bad memory access causes several downstream CUDA API failures.

Why the test still passed:

  • The valid threads (0,0), (0,1), (1,0), (1,1) wrote the correct results
  • The test checked only the valid output locations
  • The out-of-bounds accesses did not crash the program immediately
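
To see how a test can pass despite the violations, here is a toy Python model (an illustration only, not Mojo or GPU code). It assumes row-major storage and the input values 0 to 3 implied by the expected output above, and it pads the buffers with hypothetical guard slots so that stray accesses are recorded instead of faulting.

```python
# Toy model: every thread of a 3x3 block runs the unguarded kernel body
# against flat buffers that carry extra "guard" slots past the real data.
SIZE, THREADS, GUARD = 2, 3, 4
N = SIZE * SIZE

a = [0.0, 1.0, 2.0, 3.0] + [float("nan")] * GUARD   # real data + guard slots
out = [0.0] * N + [None] * GUARD
stray_threads = []

for y in range(THREADS):                # thread_idx.y
    for x in range(THREADS):            # thread_idx.x
        row, col = y, x
        idx = row * SIZE + col          # assumed row-major indexing
        if idx >= N:
            stray_threads.append((x, y))
        out[idx] = a[idx] + 10.0        # unguarded read and write

expected = [10.0, 11.0, 12.0, 13.0]
test_passed = out[:N] == expected       # the test inspects only valid slots
print("test passed:", test_passed)
print("threads that touched guard slots:", stray_threads)
```

In this model the comparison succeeds because the stray threads only disturb memory the test never looks at. On a real GPU, that memory may belong to another allocation.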

๋ฏธ์ •์˜ ๋™์ž‘ ์ดํ•ดํ•˜๊ธฐ

๋ฏธ์ •์˜ ๋™์ž‘์ด๋ž€?

๋ฏธ์ •์˜ ๋™์ž‘(Undefined Behavior, UB) ์€ ํ”„๋กœ๊ทธ๋žจ์ด ์–ธ์–ด ๋ช…์„ธ์ƒ ์ •์˜๋˜์ง€ ์•Š์€ ์—ฐ์‚ฐ์„ ์ˆ˜ํ–‰ํ•  ๋•Œ ๋ฐœ์ƒํ•ฉ๋‹ˆ๋‹ค. ๋ฒ”์œ„ ์ดˆ๊ณผ ๋ฉ”๋ชจ๋ฆฌ ์ ‘๊ทผ์ด ๋Œ€ํ‘œ์ ์ธ ์˜ˆ์ž…๋‹ˆ๋‹ค.

๋ฏธ์ •์˜ ๋™์ž‘์˜ ์ฃผ์š” ํŠน์„ฑ:

  • ํ”„๋กœ๊ทธ๋žจ์ด ๋ง ๊ทธ๋Œ€๋กœ ๋ฌด์Šจ ์ง“์ด๋“  ํ•  ์ˆ˜ ์žˆ์Œ: ํฌ๋ž˜์‹œ, ์ž˜๋ชป๋œ ๊ฒฐ๊ณผ, ์ •์ƒ ๋™์ž‘ํ•˜๋Š” ๊ฒƒ์ฒ˜๋Ÿผ ๋ณด์ด๊ธฐ, ๋ฉ”๋ชจ๋ฆฌ ์†์ƒ
  • ์–ด๋–ค ๋ณด์žฅ๋„ ์—†์Œ: ์ปดํŒŒ์ผ๋Ÿฌ, ํ•˜๋“œ์›จ์–ด, ๋“œ๋ผ์ด๋ฒ„, ์‹ฌ์ง€์–ด ์‹คํ–‰ํ•  ๋•Œ๋งˆ๋‹ค ๋™์ž‘์ด ๋‹ฌ๋ผ์งˆ ์ˆ˜ ์žˆ์Œ

๋ฏธ์ •์˜ ๋™์ž‘์ด ํŠนํžˆ ์œ„ํ—˜ํ•œ ์ด์œ 

์ •ํ™•์„ฑ ๋ฌธ์ œ:

  • ์˜ˆ์ธก ๋ถˆ๊ฐ€๋Šฅํ•œ ๊ฒฐ๊ณผ: ํ…Œ์ŠคํŠธ ์ค‘์—๋Š” ๋™์ž‘ํ•˜๋‹ค๊ฐ€ ํ”„๋กœ๋•์…˜์—์„œ ์‹คํŒจํ•  ์ˆ˜ ์žˆ์Œ
  • ๋น„๊ฒฐ์ •์  ๋™์ž‘: ๊ฐ™์€ ์ฝ”๋“œ๊ฐ€ ๋‹ค๋ฅธ ์‹คํ–‰์—์„œ ๋‹ค๋ฅธ ๊ฒฐ๊ณผ๋ฅผ ๋‚ผ ์ˆ˜ ์žˆ์Œ
  • ์กฐ์šฉํ•œ ์†์ƒ: ๋ฏธ์ •์˜ ๋™์ž‘์€ ๊ฐ€์‹œ์ ์ธ ์˜ค๋ฅ˜ ์—†์ด ๋ฐ์ดํ„ฐ๋ฅผ ์†์ƒ์‹œํ‚ฌ ์ˆ˜ ์žˆ์Œ
  • ์ปดํŒŒ์ผ๋Ÿฌ ์ตœ์ ํ™”: ์ปดํŒŒ์ผ๋Ÿฌ๋Š” ๋ฏธ์ •์˜ ๋™์ž‘์ด ์—†๋‹ค๊ณ  ๊ฐ€์ •ํ•˜๊ณ  ์˜ˆ์ƒ์น˜ ๋ชปํ•œ ๋ฐฉ์‹์œผ๋กœ ์ตœ์ ํ™”ํ•  ์ˆ˜ ์žˆ์Œ

๋ณด์•ˆ ์ทจ์•ฝ์ :

  • ๋ฒ„ํผ ์˜ค๋ฒ„ํ”Œ๋กœ์šฐ: ์‹œ์Šคํ…œ ํ”„๋กœ๊ทธ๋ž˜๋ฐ์—์„œ ๋ณด์•ˆ ๊ณต๊ฒฉ์˜ ๊ณ ์ „์ ์ธ ์›์ธ
  • ๋ฉ”๋ชจ๋ฆฌ ์†์ƒ: ๊ถŒํ•œ ์ƒ์Šน์ด๋‚˜ ์ฝ”๋“œ ์ธ์ ์…˜ ๊ณต๊ฒฉ์œผ๋กœ ์ด์–ด์งˆ ์ˆ˜ ์žˆ์Œ
  • ์ •๋ณด ์œ ์ถœ: ๋ฒ”์œ„๋ฅผ ๋ฒ—์–ด๋‚œ ์ฝ๊ธฐ๋กœ ๋ฏผ๊ฐํ•œ ๋ฐ์ดํ„ฐ๊ฐ€ ๋…ธ์ถœ๋  ์ˆ˜ ์žˆ์Œ
  • ์ œ์–ด ํ๋ฆ„ ํ•˜์ด์žฌํ‚น: ๋ฏธ์ •์˜ ๋™์ž‘์„ ์•…์šฉํ•ด ํ”„๋กœ๊ทธ๋žจ ์‹คํ–‰ ํ๋ฆ„์„ ํƒˆ์ทจํ•  ์ˆ˜ ์žˆ์Œ

GPU ํŠน์œ ์˜ ๋ฏธ์ •์˜ ๋™์ž‘ ์œ„ํ—˜์„ฑ

๋Œ€๊ทœ๋ชจ ์˜ํ–ฅ:

  • ์Šค๋ ˆ๋“œ ๋ถ„๊ธฐ: ํ•œ ์Šค๋ ˆ๋“œ์˜ ๋ฏธ์ •์˜ ๋™์ž‘์ด ์ „์ฒด ์›Œํ”„(32๊ฐœ ์Šค๋ ˆ๋“œ)์— ์˜ํ–ฅ์„ ์ค„ ์ˆ˜ ์žˆ์Œ
  • ๋ฉ”๋ชจ๋ฆฌ ๋ณ‘ํ•ฉ: ๋ฒ”์œ„ ์ดˆ๊ณผ ์ ‘๊ทผ์ด ์ธ์ ‘ ์Šค๋ ˆ๋“œ์˜ ๋ฐ์ดํ„ฐ๋ฅผ ์†์ƒ์‹œํ‚ฌ ์ˆ˜ ์žˆ์Œ
  • ์ปค๋„ ์‹คํŒจ: ๋ฏธ์ •์˜ ๋™์ž‘์ด GPU ์ปค๋„ ์ „์ฒด๋ฅผ ์™„์ „ํžˆ ๋ง๊ฐ€๋œจ๋ฆด ์ˆ˜ ์žˆ์Œ

ํ•˜๋“œ์›จ์–ด ์ฐจ์ด:

  • ๋‹ค๋ฅธ GPU ์•„ํ‚คํ…์ฒ˜: ๋ฏธ์ •์˜ ๋™์ž‘์ด ๋‹ค๋ฅธ GPU ๋ชจ๋ธ์—์„œ ๋‹ค๋ฅด๊ฒŒ ๋‚˜ํƒ€๋‚  ์ˆ˜ ์žˆ์Œ
  • ๋“œ๋ผ์ด๋ฒ„ ์ฐจ์ด: ๊ฐ™์€ ๋ฏธ์ •์˜ ๋™์ž‘์ด ๋“œ๋ผ์ด๋ฒ„ ๋ฒ„์ „์— ๋”ฐ๋ผ ๋‹ค๋ฅด๊ฒŒ ๋™์ž‘ํ•  ์ˆ˜ ์žˆ์Œ
  • ๋ฉ”๋ชจ๋ฆฌ ๋ ˆ์ด์•„์›ƒ ๋ณ€๊ฒฝ: GPU ๋ฉ”๋ชจ๋ฆฌ ํ• ๋‹น ํŒจํ„ด์— ๋”ฐ๋ผ ๋ฏธ์ •์˜ ๋™์ž‘์ด ๋‹ค๋ฅด๊ฒŒ ๋‚˜ํƒ€๋‚  ์ˆ˜ ์žˆ์Œ

Fixing the memory violation

The solution

As in Puzzle 04, add a bounds check like this:

fn add_10_2d(
    output: LayoutTensor[dtype, layout, MutAnyOrigin],
    a: LayoutTensor[dtype, layout, MutAnyOrigin],
    size: UInt,
):
    row = thread_idx.y
    col = thread_idx.x
    if col < size and row < size:
        output[row, col] = a[row, col] + 10.0


ํ•ด๊ฒฐ์ฑ…์€ ๊ฐ„๋‹จํ•ฉ๋‹ˆ๋‹ค: ๋ฉ”๋ชจ๋ฆฌ์— ์ ‘๊ทผํ•˜๊ธฐ ์ „์— ํ•ญ์ƒ ์Šค๋ ˆ๋“œ ์ธ๋ฑ์Šค๋ฅผ ๋ฐ์ดํ„ฐ ์ฐจ์›์— ๋Œ€ํ•ด ๊ฒ€์ฆํ•˜์„ธ์š”.

compute-sanitizer๋กœ ๊ฒ€์ฆ

# p10.mojo ๋ณต์‚ฌ๋ณธ์—์„œ ๊ฒฝ๊ณ„ ๊ฒ€์‚ฌ๋ฅผ ์ˆ˜์ •ํ•œ ํ›„ ์‹คํ–‰:
MODULAR_DEVICE_CONTEXT_MEMORY_MANAGER_SIZE_PERCENT=0 pixi run compute-sanitizer --tool memcheck mojo problems/p10/p10.mojo --memory-bug
========= COMPUTE-SANITIZER
out shape: 2 x 2
Running memory bug example (bounds checking issue)...
out: HostBuffer([10.0, 11.0, 12.0, 13.0])
expected: HostBuffer([10.0, 11.0, 12.0, 13.0])
✅ Memory test PASSED! (memcheck may find bounds violations)
========= ERROR SUMMARY: 0 errors

✅ Success: no memory violations detected!

Key takeaways

Why manual bounds checking matters

  1. Clarity: the safety requirement is stated explicitly in the code
  2. Control: you decide exactly what happens in out-of-bounds cases
  3. Debugging: memory violations are easier to reason about when they occur

GPU memory safety rules

  1. Always validate thread indices against the data dimensions
  2. Avoid undefined behavior at all costs: out-of-bounds access is undefined behavior and can break everything
  3. Use compute-sanitizer during development and testing
  4. Never assume code "works" without memory checking
  5. Test with different grid/block configurations to catch undefined behavior that shows up inconsistently
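
Rule 5 can be rehearsed on the host before touching the GPU. The following Python sketch (a model under a row-major assumption, with a hypothetical helper name) counts how many threads would step past a SIZE×SIZE allocation for several block shapes, with and without the guard.

```python
def stray_accesses(size: int, block_x: int, block_y: int, guarded: bool) -> int:
    """Count threads whose flat index falls outside a size x size tensor."""
    count = 0
    for row in range(block_y):              # thread_idx.y
        for col in range(block_x):          # thread_idx.x
            if guarded and not (row < size and col < size):
                continue                    # the bounds check skips this thread
            if row * size + col >= size * size:
                count += 1
    return count


SIZE = 2
for block in [(2, 2), (3, 3), (4, 4), (8, 1)]:
    unguarded = stray_accesses(SIZE, *block, guarded=False)
    guarded = stray_accesses(SIZE, *block, guarded=True)
    print(f"block {block}: unguarded={unguarded}, guarded={guarded}")
```

A launch that happens to match the data shape (2×2 here) hides the bug entirely, which is why a single configuration is weak evidence. In this model the guarded variant stays at zero for every shape.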

compute-sanitizer best practices

MODULAR_DEVICE_CONTEXT_MEMORY_MANAGER_SIZE_PERCENT=0 pixi run compute-sanitizer --tool memcheck mojo your_code.mojo

์ฐธ๊ณ : ์ƒˆ๋‹ˆํƒ€์ด์ € ์ถœ๋ ฅ์—์„œ Mojo ๋Ÿฐํƒ€์ž„ ๊ฒฝ๊ณ ๋ฅผ ๋ณผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์‹ค์ œ ๋ฉ”๋ชจ๋ฆฌ ์œ„๋ฐ˜์„ ํ™•์ธํ•˜๋ ค๋ฉด ========= Invalid์™€ ========= ERROR SUMMARY ๋ผ์ธ์— ์ง‘์ค‘ํ•˜์„ธ์š”.