Key concepts
In this puzzle, you'll learn about:
- Basic GPU kernel structure
- Thread indexing with thread_idx.x
- Simple parallel operations

- Parallelism: each thread executes independently
- Thread indexing: each thread accesses the element at position i = thread_idx.x
- Memory access: each thread reads from a[i] and writes to output[i]
- Data independence: each output depends only on its corresponding input
Code to complete
comptime SIZE = 4
comptime BLOCKS_PER_GRID = 1
comptime THREADS_PER_BLOCK = SIZE
comptime dtype = DType.float32

fn add_10(
    output: UnsafePointer[Scalar[dtype], MutAnyOrigin],
    a: UnsafePointer[Scalar[dtype], MutAnyOrigin],
):
    i = thread_idx.x
    # FILL ME IN (roughly 1 line)
View full code: problems/p01/p01.mojo
Tips
- Store thread_idx.x in i
- Add 10 to a[i]
- Store the result in output[i]
Running the code
To test your solution, run the command for your environment in your terminal:
pixi run p01
pixi run -e amd p01
pixi run -e apple p01
uv run poe p01
If you haven't solved the puzzle yet, the output looks like this:
out: HostBuffer([0.0, 0.0, 0.0, 0.0])
expected: HostBuffer([10.0, 11.0, 12.0, 13.0])
Solution
fn add_10(
    output: UnsafePointer[Scalar[dtype], MutAnyOrigin],
    a: UnsafePointer[Scalar[dtype], MutAnyOrigin],
):
    i = thread_idx.x
    output[i] = a[i] + 10.0
This solution:
- Gets the thread index with i = thread_idx.x
- Adds 10 to the input value: output[i] = a[i] + 10.0
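Because each output element depends only on its own input, the order in which the threads run cannot change the result. The following is a minimal sketch in plain Python, a CPU emulation rather than Mojo GPU code, that runs one emulated thread per element in two different orders. The launch helper and the input values [0.0, 1.0, 2.0, 3.0] are assumptions made for illustration; the input is inferred from the expected output shown earlier.

```python
import random

SIZE = 4  # mirrors SIZE in the puzzle


def add_10(output, a, thread_idx_x):
    """Kernel body as executed by a single emulated thread."""
    i = thread_idx_x
    output[i] = a[i] + 10.0


def launch(a, order):
    """Hypothetical helper: run one emulated thread per index, in the given order."""
    output = [0.0] * len(a)
    for thread_idx_x in order:
        add_10(output, a, thread_idx_x)
    return output


# Assumed input, inferred from the expected output of the puzzle.
a = [float(i) for i in range(SIZE)]

in_order = launch(a, range(SIZE))
shuffled = launch(a, random.sample(range(SIZE), SIZE))

print(in_order)
print(in_order == shuffled)
```

Both launches produce the same list, which is why a real GPU is free to schedule these threads in any order or all at once.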