Introduction to LayoutTensor

Let's take a short break from solving puzzles to preview a powerful abstraction that will make GPU programming more enjoyable: 🥁 … LayoutTensor.

💡 This is a quick taste of what LayoutTensor can do. You don't need to understand everything now - we'll look at each feature in detail as we work through the puzzles.

The problem: increasingly complex code

Let's look at the difficulties we've run into so far:

# Puzzle 1: Simple indexing
output[i] = a[i] + 10.0

# Puzzle 2: Managing multiple arrays
output[i] = a[i] + b[i]

# Puzzle 3: Bounds checking
if i < size:
    output[i] = a[i] + 10.0

As the number of dimensions grows, the code gets more complex:

# Traditional 2D indexing (row-major 2D matrix)
idx = row * width + col
if row < height and col < width:
    output[idx] = a[idx] + 10.0
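
To make the manual offset arithmetic concrete, here is a minimal sketch in plain Mojo. It does not use LayoutTensor; the helper name `flat_index` and the 2 x 3 shape are assumptions chosen for illustration.

```mojo
fn flat_index(row: Int, col: Int, width: Int) -> Int:
    # Skip `row` complete rows of `width` elements, then step `col` elements in.
    return row * width + col


def main():
    var height = 2
    var width = 3
    # Print the flat offset that each (row, col) pair maps to.
    for row in range(height):
        for col in range(width):
            print(row, col, "->", flat_index(row, col, width))
```

For a 2 x 3 matrix, the pair (1, 2) maps to offset 5, the last slot of the 6-element buffer. Every kernel that indexes this way has to repeat the same arithmetic and keep it consistent with the bounds check.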

The solution: a preview of LayoutTensor

LayoutTensor solves these problems cleanly. Here's a glimpse of what's coming:

  1. Natural indexing: use tensor[i, j] instead of computing offsets by hand
  2. Flexible memory layouts: support for row-major, column-major, and tiled organizations
  3. Performance optimization: memory access patterns that are efficient on the GPU
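
To illustrate what a memory layout decides, the sketch below compares where the same logical element lands under row-major and column-major ordering. It is plain Mojo arithmetic, not the LayoutTensor API, and the function names `row_major_offset` and `col_major_offset` are hypothetical.

```mojo
fn row_major_offset(row: Int, col: Int, width: Int) -> Int:
    # Row-major: elements of one row sit next to each other in memory.
    return row * width + col


fn col_major_offset(row: Int, col: Int, height: Int) -> Int:
    # Column-major: elements of one column sit next to each other in memory.
    return col * height + row


def main():
    var height = 2
    var width = 3
    # The same logical element (row 0, col 1) of a 2 x 3 matrix:
    print(row_major_offset(0, 1, width))   # offset 1
    print(col_major_offset(0, 1, height))  # offset 2
```

The logical index (0, 1) is the same in both cases; only the mapping to a flat offset changes. A layout abstraction keeps that mapping in one place, so code that indexes by (row, col) does not need to change when the ordering does.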

A taste of what's ahead

Let's look at a few examples of what LayoutTensor can do. You don't need to understand every detail now - upcoming puzzles cover each feature thoroughly.

Basic usage example

from layout import Layout, LayoutTensor

# Define the layout
comptime HEIGHT = 2
comptime WIDTH = 3
comptime layout = Layout.row_major(HEIGHT, WIDTH)

# Create the tensor
tensor = LayoutTensor[dtype, layout](buffer.unsafe_ptr())

# Access elements naturally
tensor[0, 0] = 1.0  # First element
tensor[1, 2] = 2.0  # Last element

To learn more about Layout and LayoutTensor, see the guides in the Mojo manual.

Quick example

Let's put everything together with a simple example that demonstrates the basics of LayoutTensor:

from gpu.host import DeviceContext
from layout import Layout, LayoutTensor

comptime HEIGHT = 2
comptime WIDTH = 3
comptime dtype = DType.float32
comptime layout = Layout.row_major(HEIGHT, WIDTH)


fn kernel[
    dtype: DType, layout: Layout
](tensor: LayoutTensor[dtype, layout, MutAnyOrigin]):
    print("Before:")
    print(tensor)
    tensor[0, 0] += 1
    print("After:")
    print(tensor)


def main() raises:
    ctx = DeviceContext()

    a = ctx.enqueue_create_buffer[dtype](HEIGHT * WIDTH)
    a.enqueue_fill(0)
    tensor = LayoutTensor[dtype, layout, MutAnyOrigin](a)
    # Note: since `tensor` is a device tensor we can't print it without the kernel wrapper
    ctx.enqueue_function[kernel[dtype, layout], kernel[dtype, layout]](
        tensor, grid_dim=1, block_dim=1
    )

    ctx.synchronize()

When we run this code with one of the following commands:

pixi run layout_tensor_intro
pixi run -e amd layout_tensor_intro
pixi run -e apple layout_tensor_intro
uv run poe layout_tensor_intro

we get this output:

Before:
0.0 0.0 0.0
0.0 0.0 0.0
After:
1.0 0.0 0.0
0.0 0.0 0.0

Let's break down what's happening:

  1. We create a 2 x 3 tensor with a row-major layout
  2. Initially, all elements are zero
  3. Using natural indexing, we modify a single element
  4. The change is reflected in the output

This simple example demonstrates the key benefits of LayoutTensor:

  • Clean syntax for creating and accessing tensors
  • Automatic handling of memory layout
  • Natural multi-dimensional indexing

The example is simple, but the same pattern carries over directly to the complex GPU operations in upcoming puzzles. You'll see how these basic concepts extend to:

  • Multi-threaded GPU operations
  • Shared memory optimizations
  • Complex tiling strategies
  • Hardware-accelerated computations

Ready to start your GPU programming journey with LayoutTensor? Let's dive into the puzzles!

💡 Tip: Keep this example in mind as you go - we'll build on these basic concepts to create increasingly sophisticated GPU programs.