Why consider LayoutTensor?

Looking at the existing implementation below, you can spot a few potential problems:

Current approach

i = thread_idx.x
output[i] = a[i] + 10.0

This works well for 1D arrays, but what about situations like these?

  • Working with 2D or 3D data
  • Handling different memory layouts
  • Guaranteeing coalesced memory access

Preview of upcoming challenges

As the puzzles progress, array indexing becomes increasingly complex:

# 2D indexing, covered in later puzzles
idx = row * WIDTH + col

# 3D indexing
idx = (batch * HEIGHT + row) * WIDTH + col

# With padding
idx = (batch * padded_height + row) * padded_width + col
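
To make the three formulas above concrete, here is a minimal host-side sketch in plain Python, chosen so it runs without a GPU; the arithmetic is the same inside a kernel. The function names and the sizes are hypothetical, picked only for illustration, and row-major storage is assumed.

```python
# Hypothetical helpers illustrating row-major flat indexing.
# Names and sizes are illustrative assumptions, not part of the puzzles.

def flat_index_2d(row, col, width):
    """Offset of (row, col) in a row-major 2D array with `width` columns."""
    return row * width + col


def flat_index_3d(batch, row, col, height, width):
    """Offset of (batch, row, col) in a row-major 3D array."""
    return (batch * height + row) * width + col


HEIGHT, WIDTH = 3, 4

# Element (row=1, col=2) of a 3x4 array sits after one full row of 4.
idx_2d = flat_index_2d(1, 2, WIDTH)              # 1 * 4 + 2 = 6

# Element (batch=1, row=1, col=2): one full 3x4 batch, one row, then 2.
idx_3d = flat_index_3d(1, 1, 2, HEIGHT, WIDTH)   # (1 * 3 + 1) * 4 + 2 = 18

# With padding, the same formula is applied to the padded extents,
# so the same logical element lands at a different offset.
PADDED_HEIGHT, PADDED_WIDTH = 4, 8
idx_padded = flat_index_3d(1, 1, 2, PADDED_HEIGHT, PADDED_WIDTH)  # (1 * 4 + 1) * 8 + 2 = 42

print(idx_2d, idx_3d, idx_padded)
```

Every kernel that touches such a buffer has to repeat this arithmetic with the correct extents. Passing WIDTH where padded_width is required does not raise an error; it silently addresses the wrong element.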

LayoutTensor preview

LayoutTensor handles these cases far more cleanly:

# Preview - you don't need to know this syntax yet!
output[i, j] = a[i, j] + 10.0  # 2D indexing
output[b, i, j] = a[b, i, j] + 10.0  # 3D indexing
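
To get a feel for what multi-dimensional indexing hides, the following is a toy stand-in written in plain Python. It is not LayoutTensor and does not reflect its API: `TinyTensor` is a hypothetical class that assumes a row-major layout and simply moves the stride arithmetic behind `t[i, j]`.

```python
class TinyTensor:
    """Hypothetical row-major view over a flat list (not the real LayoutTensor)."""

    def __init__(self, data, shape):
        size = 1
        for extent in shape:
            size *= extent
        if len(data) != size:
            raise ValueError("data length does not match shape")
        self.data = data
        self.shape = tuple(shape)
        # Row-major strides: the last dimension is contiguous in memory.
        strides = []
        step = 1
        for extent in reversed(self.shape):
            strides.append(step)
            step *= extent
        self.strides = tuple(reversed(strides))

    def _offset(self, index):
        """Translate a multi-dimensional index into a flat offset."""
        if not isinstance(index, tuple):
            index = (index,)
        if len(index) != len(self.shape):
            raise IndexError("wrong number of indices")
        offset = 0
        for i, extent, stride in zip(index, self.shape, self.strides):
            if not 0 <= i < extent:
                raise IndexError("index out of bounds")
            offset += i * stride
        return offset

    def __getitem__(self, index):
        return self.data[self._offset(index)]

    def __setitem__(self, index, value):
        self.data[self._offset(index)] = value


# A 2 x 3 input stored as a flat buffer, as it would be in device memory.
a = TinyTensor([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], (2, 3))
output = TinyTensor([0.0] * 6, (2, 3))

for i in range(2):
    for j in range(3):
        output[i, j] = a[i, j] + 10.0  # no manual row * WIDTH + col

print(output.data)
```

The design point is that the layout (shape and strides) lives in one place, so changing it does not require touching every indexing expression in the kernel.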

Puzzle 4 covers LayoutTensor in detail, and that is where these concepts become essential. For now, focus on understanding the following:

  • Basic thread indexing
  • Simple memory access patterns
  • One-to-one mapping between threads and data
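
These three ideas can be tried out on the CPU with a minimal Python simulation. Each loop iteration plays the role of one GPU thread, and `thread_idx_x` stands in for `thread_idx.x`. The names `SIZE`, `add_10_thread`, and `launch` are hypothetical and exist only in this sketch, and the loop runs sequentially, unlike real GPU threads.

```python
SIZE = 4  # assumed number of elements, one simulated thread per element


def add_10_thread(thread_idx_x, output, a):
    """Body executed by a single simulated thread."""
    i = thread_idx_x          # basic thread indexing
    output[i] = a[i] + 10.0   # one read and one write at the same index


def launch(kernel, num_threads, *args):
    """Sequential stand-in for launching one block of `num_threads` threads."""
    for thread_idx_x in range(num_threads):
        kernel(thread_idx_x, *args)


a = [0.0, 1.0, 2.0, 3.0]
output = [0.0] * SIZE
launch(add_10_thread, SIZE, output, a)
print(output)
```

Because thread i touches only element i, no two threads write to the same location, which is what makes the one-to-one mapping safe without synchronization.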

💡 Key takeaway: Direct indexing works well for simple cases, but complex GPU programming patterns will soon call for more sophisticated tools.