Puzzle 7: 2D Blocks

Overview

Implement a kernel that adds 10 to each position of the 2D TileTensor a and stores the result in the 2D TileTensor output.

Note: The number of threads per block is smaller than the size of a in both rows and columns.

Figure: 2D Blocks visualization

Key concepts

In this puzzle, you will learn:

  • Using TileTensor with multiple blocks
  • Handling large matrices with a 2D block organization
  • Combining block indexing with TileTensor access

The key insight is that TileTensor simplifies 2D indexing, but large matrices still require coordination across blocks.

🔑 2D thread indexing convention

This extends the block-based indexing from Puzzle 4: 2D Map to two dimensions:

Global position calculation:
row = block_dim.y * block_idx.y + thread_idx.y
col = block_dim.x * block_idx.x + thread_idx.x

For example, using 2×2 blocks on a 4×4 grid:

Block (0,0):   Block (1,0):
[0,0  0,1]     [0,2  0,3]
[1,0  1,1]     [1,2  1,3]

Block (0,1):   Block (1,1):
[2,0  2,1]     [2,2  2,3]
[3,0  3,1]     [3,2  3,3]

Each position shows that thread's global index (row, col); blocks are labeled (block_idx.x, block_idx.y). The block dimensions and indices work together to ensure:

  • Gap-free coverage of the whole 2D space
  • No overlap between blocks
  • Efficient memory access: threads adjacent in x touch adjacent columns, which are contiguous in a row-major layout
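The coverage and no-overlap properties can be checked on the CPU. The following is a minimal sketch in plain Python, not Mojo GPU code: it enumerates every (block, thread) pair of the 4×4 example and counts how often each global position is produced. The names BLOCK_DIM, GRID_DIM and global_positions are hypothetical helpers introduced only for this illustration.

```python
from collections import Counter

BLOCK_DIM = (2, 2)  # (x, y) threads per block, as in the 4x4 example
GRID_DIM = (2, 2)   # (x, y) blocks in the grid


def global_positions(block_dim, grid_dim):
    """Yield the global (row, col) of every thread in the launch."""
    for by in range(grid_dim[1]):
        for bx in range(grid_dim[0]):
            for ty in range(block_dim[1]):
                for tx in range(block_dim[0]):
                    row = block_dim[1] * by + ty
                    col = block_dim[0] * bx + tx
                    yield row, col


hits = Counter(global_positions(BLOCK_DIM, GRID_DIM))
all_cells = {(r, c) for r in range(4) for c in range(4)}

assert set(hits) == all_cells              # every cell is covered
assert all(n == 1 for n in hits.values())  # no cell is covered twice
print(len(hits))  # 16
```

Each of the 16 cells is produced exactly once, which is what "no gaps, no overlap" means for this launch shape.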

Configuration

  • Matrix size: \(5 \times 5\) elements
  • Layout handling: TileTensor manages the row-major organization
  • Block coordination: multiple blocks cover the full matrix
  • 2D indexing: natural \((i,j)\) access with bounds checking
  • Total threads: \(36\) for \(25\) elements
  • Thread mapping: each thread processes one matrix element
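The thread count above follows from simple arithmetic. As a rough sketch in plain Python (the 3×3 block size is taken from the puzzle code; the ceiling-division step is an assumption about how one would typically derive the grid size, not something the puzzle states):

```python
SIZE = 5
THREADS_PER_BLOCK = (3, 3)  # (x, y), taken from the puzzle configuration

# Ceiling division: the smallest number of blocks that covers SIZE per axis.
blocks_per_grid = tuple((SIZE + t - 1) // t for t in THREADS_PER_BLOCK)

threads_per_block = THREADS_PER_BLOCK[0] * THREADS_PER_BLOCK[1]
total_threads = blocks_per_grid[0] * blocks_per_grid[1] * threads_per_block
idle_threads = total_threads - SIZE * SIZE

print(blocks_per_grid)  # (2, 2)
print(total_threads)    # 36
print(idle_threads)     # 11
```

The 11 surplus threads are the reason the kernel needs a bounds check.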

Code to complete

comptime SIZE = 5
comptime BLOCKS_PER_GRID = (2, 2)
comptime THREADS_PER_BLOCK = (3, 3)
comptime dtype = DType.float32
comptime out_layout = row_major[SIZE, SIZE]()
comptime a_layout = row_major[SIZE, SIZE]()
comptime OutLayout = type_of(out_layout)
comptime ALayout = type_of(a_layout)


def add_10_blocks_2d(
    output: TileTensor[mut=True, dtype, OutLayout, MutAnyOrigin],
    a: TileTensor[mut=False, dtype, ALayout, ImmutAnyOrigin],
    size: Int,
):
    var row = block_dim.y * block_idx.y + thread_idx.y
    var col = block_dim.x * block_idx.x + thread_idx.x
    # FILL ME IN (roughly 2 lines)


View full file: problems/p07/p07.mojo

Tips
  1. Compute the global indices: row = block_dim.y * block_idx.y + thread_idx.y, col = block_dim.x * block_idx.x + thread_idx.x
  2. Add a guard: if row < size and col < size
  3. Inside the guard: think about how to add 10 to a 2D TileTensor element

Running the code

To test your solution, run one of the following commands in your terminal, depending on your environment:

pixi run p07
pixi run -e amd p07
pixi run -e apple p07
uv run poe p07

If you have not solved the puzzle yet, the output looks like this:

out: HostBuffer([0.0, 0.0, 0.0, ... , 0.0])
expected: HostBuffer([10.0, 11.0, 12.0, ... , 34.0])
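The expected buffer is consistent with an input that holds 0.0 through 24.0 in row-major order. That input is an assumption inferred from the output shown above, not something stated in this section. Under that assumption, a plain Python sketch of the reference values is:

```python
SIZE = 5

# Assumed input: 0.0, 1.0, ..., 24.0 (inferred from the expected output).
a = [float(i) for i in range(SIZE * SIZE)]
expected = [x + 10.0 for x in a]

print(expected[:3], expected[-1])  # [10.0, 11.0, 12.0] 34.0
```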

Solution

def add_10_blocks_2d(
    output: TileTensor[mut=True, dtype, OutLayout, MutAnyOrigin],
    a: TileTensor[mut=False, dtype, ALayout, ImmutAnyOrigin],
    size: Int,
):
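    # Global (row, col) of this thread across all blocks
    var row = block_dim.y * block_idx.y + thread_idx.y
    var col = block_dim.x * block_idx.x + thread_idx.x
    # Guard: threads past the matrix edge do nothing
    if row < size and col < size:
        output[row, col] = a[row, col] + 10.0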


This solution shows how TileTensor simplifies 2D block-based processing:

  1. 2D thread indexing

    • Global row: block_dim.y * block_idx.y + thread_idx.y

    • Global col: block_dim.x * block_idx.x + thread_idx.x

    • Maps the thread grid onto tensor elements:

      5×5 tensor covered by 3×3 blocks:
      
      Block (0,0)         Block (1,0)
      [(0,0) (0,1) (0,2)] [(0,3) (0,4)    *  ]
      [(1,0) (1,1) (1,2)] [(1,3) (1,4)    *  ]
      [(2,0) (2,1) (2,2)] [(2,3) (2,4)    *  ]
      
      Block (0,1)         Block (1,1)
      [(3,0) (3,1) (3,2)] [(3,3) (3,4)    *  ]
      [(4,0) (4,1) (4,2)] [(4,3) (4,4)    *  ]
      [  *     *     *  ] [  *     *      *  ]
      

      (* = thread exists but falls outside the tensor bounds)

  2. TileTensor benefits

    • Natural 2D indexing: use tensor[row, col] instead of computing offsets by hand

    • Automatic memory layout optimization

    • Example access pattern:

      Raw memory:         TileTensor:
      row * size + col    tensor[row, col]
      (2,1) -> 11         (2,1) -> same element
      
  3. Bounds checking

    • The guard row < size and col < size handles:
      • Out-of-range threads in partial blocks
      • Edge cases at the tensor boundaries
    • TileTensor handles the memory layout automatically
    • 36 threads (a 2×2 grid of 3×3 blocks) process 25 elements, so 11 threads stay idle
  4. Block coordination

    • Each 3×3 block handles one part of the 5×5 tensor
    • TileTensor handles:
      • Memory layout optimization
      • Efficient access patterns
      • Coordination across block boundaries
      • Cache-friendly data access

This pattern shows how TileTensor simplifies 2D block processing while keeping optimal memory access patterns and thread coordination.
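To make the guard and the row-major offset concrete, here is a small CPU simulation in plain Python. It is only a sketch of the indexing logic, not the Mojo kernel: the function add_10_blocks_2d_sim is a hypothetical stand-in that loops over every (block, thread) pair sequentially, uses flat lists with the manual offset row * size + col in place of TileTensor, and assumes the input holds 0.0 through 24.0.

```python
SIZE = 5
BLOCKS_PER_GRID = (2, 2)    # (x, y), from the puzzle configuration
THREADS_PER_BLOCK = (3, 3)  # (x, y), from the puzzle configuration


def add_10_blocks_2d_sim(a, size):
    """Run the kernel body once per simulated thread; return (out, idle)."""
    out = [0.0] * (size * size)
    idle = 0
    for by in range(BLOCKS_PER_GRID[1]):
        for bx in range(BLOCKS_PER_GRID[0]):
            for ty in range(THREADS_PER_BLOCK[1]):
                for tx in range(THREADS_PER_BLOCK[0]):
                    row = THREADS_PER_BLOCK[1] * by + ty
                    col = THREADS_PER_BLOCK[0] * bx + tx
                    if row < size and col < size:
                        # Manual row-major offset, the part tensor[row, col] hides.
                        out[row * size + col] = a[row * size + col] + 10.0
                    else:
                        idle += 1  # thread exists but is outside the tensor
    return out, idle


a = [float(i) for i in range(SIZE * SIZE)]  # assumed input values
out, idle = add_10_blocks_2d_sim(a, SIZE)

print(out[2 * SIZE + 1])  # 21.0, element (2, 1) sits at flat offset 11
print(idle)               # 11
```

Removing the guard in this simulation would write past row 4 or wrap into the next row, which is the class of out-of-bounds access the guard prevents on the GPU.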