DataFlow ๋ฒ„ํ‹ฐ์Šค(vertices)

๐Ÿฆ‰ https://github.com/MikeInnes/DataFlow.jl/blob/master/docs/vertices.md ๋ฒˆ์—ญ

DataFlow๊ฐ€ ํ•˜๋Š” ๋‘๊ฐ€์ง€:

์„œ๋กœ ์–ถ๋งค์—ฌ ์žˆ์ง€ ์•Š์œผ๋‹ˆ ํŽธํ•˜๊ฒŒ ์“ฐ๋ฉด ๋œ๋‹ค; ์˜ˆ๋ฅผ ๋“ค์–ด, ์ด ๋ฌธ๋ฒ•์œผ๋กœ ๋งŒ๋“  ๊ทธ๋ž˜ํ”„๋ฅผ ์ธ์ ‘ ํ–‰๋ ฌ(an adjacency matrix)๋กœ ๋ณ€ํ™˜ํ•˜์—ฌ ์ฒ˜๋ฆฌํ•œ๋‹ค๊ฑฐ๋‚˜ DataFlow์˜ ๊ณตํ†ต ๊ทธ๋ž˜ํ”„ ์—ฐ์‚ฐ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋ฅผ ํ™œ์šฉํ•˜์—ฌ ๋‹ค๋ฅธ ๋ฐฉ์‹์œผ๋กœ ๊ทธ๋ž˜ํ”„๋ฅผ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ๋‹ค.

DataFlow๋Š” ๋ช…์‹œ์ ์œผ๋กœ ๋ฐ์ดํ„ฐ ๊ตฌ์กฐ๋ฅผ ๋‹จ์ˆœํ•˜๊ฒŒ ์œ ์ง€ํ•˜๊ณ , ๋‹ค๋ฅธ ์–ด๋– ํ•œ ์˜๋ฏธ๋„ ๋ง๋ถ™์ด์ง€ ์•Š๋Š”๋‹ค. ๊ทธ๋ž˜ํ”„๋Š” ๋ฐ˜๋“ฏํ•œ ์ค„๋ฆฌ์•„ ํ”„๋กœ๊ทธ๋žจ, ๋ฒ ์ด์‹œ์•ˆ ๋„คํŠธ์›Œํฌ, ๋˜๋Š” ์ „๊ธฐ ํšŒ๋กœ์™€ ๊ฐ™์€ ๊ฒƒ์„ ํ‘œํ˜„ํ•  ์ˆ˜ ์žˆ๋‹ค. (Flux์™€ ๊ฐ™์€) DataFlow๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋Š” ๋ฌธ๋ฒ•์„ ํ™•์žฅํ•  ์ˆ˜ ์žˆ๊ธฐ๋ฅผ ์›ํ•˜๋ฉฐ ์‘์šฉ ํ”„๋กœ๊ทธ๋žจ์— ์ ์ ˆํ•œ ์ฝ”๋“œ๋ฅผ ์ƒ์„ฑ์‹œํ‚ค๋Š” ๊ทธ๋ž˜ํ”„๋ฅผ ๋‹ค๋ฃฐ ๊ฒƒ์ด๋‹ค.

๋ฐ์ดํ„ฐ ๊ตฌ์กฐ

DataFlow๋Š” DVertex์™€ IVertex๋ผ๋Š” ๋‘ ๊ฐœ์˜ ๋ฐ์ดํ„ฐ ๊ตฌ์กฐ๊ฐ€ ์žˆ๋‹ค. ๋‘˜์€ ๊ทธ๋ž˜ํ”„์˜ ๋…ธ๋“œ(nodes)๊ฐ€ ๋‹ค๋ฅธ ๋…ธ๋“œ์™€ ์ž…๋ ฅ/์ถœ๋ ฅ(inputs/outputs)์ด ์–ด๋””์—/์–ด๋””์—์„œ(to/from) ์ด๋ค„์งˆ ๊ฒƒ์ธ์ง€ ํ‘œํ˜„ํ•œ๋‹ค. IVertex๋Š” ๋งํฌ๋“œ ๋ฆฌ์ŠคํŠธ(a linked list) ์ฒ˜๋Ÿผ ์ž…๋ ฅ-์—ฐ๊ฒฐ(input-linked) ์ด๋‹ค - ์ž…๋ ฅ์œผ๋กœ ์‚ฌ์šฉ๋˜๋Š” ๋…ธ๋“œ์— ๋Œ€ํ•œ ์ฐธ์กฐ(a reference)๋ฅผ ์œ ์ง€ํ•œ๋‹ค. DVertex๋Š” ์ด์ค‘์œผ๋กœ ์—ฐ๊ฒฐํ•œ ๊ฒƒ์œผ๋กœ ๋”๋ธ”-๋งํฌ๋“œ ๋ฆฌ์ŠคํŠธ(doubly-linked list)์— ํ•ด๋‹นํ•œ๋‹ค - ์ž…๋ ฅ๊ณผ ์ด๊ฒƒ์„ ์ž…๋ ฅ์œผ๋กœ ๊ฐ–๋Š” ๋ชจ๋“  ๋…ธ๋“œ๋ฅผ ์ฐธ์กฐํ•œ๋‹ค. DVertex๋Š” ๊ธฐ์ˆ ์ ์œผ๋ก  ํ‘œํ˜„๋ ฅ์ด ๋” ์ข‹์ง€๋งŒ ์ž‘์—…ํ•˜๊ธฐ๋„ ๋” ๋นก์„ธ๋‹ˆ๊นŒ, ๊ฐ€๋Šฅํ•˜๋‹ค๋ฉด ์ž…๋ ฅ-์—ฐ๊ฒฐ(input-linked)๋กœ ๋ฐ”๊ฟ”์„œ ์“ฐ๋Š”๊ฒŒ ์ตœ์„ ์ด๋‹ค (DataFlow.il()๋กœ ํ•  ์ˆ˜ ์žˆ๋‹ค).

# src/graph/graph.jl
abstract type Vertex{T} end

# src/graph/ilgraph.jl
struct IVertex{T} <: Vertex{T}
  value::T
  inputs::Vector{IVertex{T}}
  # outputs::Set{IVertex{T}} # DVertex๋Š” ์š”๊ฑธ ์ถ”๊ฐ€
end

IVertex๋Š” ์ค„๋ฆฌ์•„์˜ Expr ๊ฐ์ฒด์™€ ์œ ์‚ฌํ•˜๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด, ๋‹ค์Œ๊ณผ ๊ฐ™์ด ํ‘œํ˜„์‹ x+length(xs) ๋ฅผ ๋น„์Šทํ•œ ๋ฐฉ์‹์œผ๋กœ์„œ ์ €์žฅํ•œ๋‹ค.

julia> using DataFlow

julia> x = 2
2

julia> Expr(:call, :+, x, Expr(:call, :length, :xs))
:(2 + length(xs))

julia> IVertex(:+, IVertex(:x), IVertex(:length, IVertex(:xs)))
IVertex{Symbol}(x() + length(xs()))

์ฃผ์š”ํ•œ ์ฐจ์ด๋Š” ๊ฐ์ฒด ์•„์ด๋ดํ‹ฐํ‹ฐ(object identity)๊ฐ€ DataFlow ๊ทธ๋ž˜ํ”„์—์„œ๋Š” ์ค‘์š”ํ•˜๋‹ค๋Š” ๊ฒƒ์ด๋‹ค. ๋‹ค์Œ๊ณผ ๊ฐ™์ด ๊ตฌ๋ฌธ ํ‘œํ˜„ ํŠธ๋ฆฌ(an expression tree)๋ฅผ ๋งŒ๋“ค๋ฉด:

julia> foo = Expr(:call, :length, :xs)
:(length(xs))

julia> Expr(:call, :+, foo, foo)
:(length(xs) + length(xs))

length(xs) ํ‘œํ˜„์‹์„ ์žฌ์‚ฌ์šฉ ํ–ˆ์Œ์—๋„ length(xs)+length(xs)๋ฅผ ์ถœ๋ ฅํ•œ๋‹ค. DataFlow์˜ ์žฌ์‚ฌ์šฉ์€ ํฐ ์ฐจ์ด๊ฐ€ ์žˆ๋‹ค:

julia> g = IVertex{Any}
IVertex

julia> g(:+, g(:length, g(:xs)), g(:length, g(:xs)))
IVertex(length(xs()) + length(xs()))

julia> foo = g(:length, g(:xs))
IVertex(length(xs()))

julia> g(:+, foo, foo)
IVertex(
eland = length(xs())
eland + eland)

์žฌ์‚ฌ์šฉ ํ•œ ๊ฒƒ์ด ํ”„๋กœ๊ทธ๋žจ ๊ทธ๋ž˜ํ”„์— ์ธ์ฝ”๋“œ ๋˜์—ˆ๋‹ค. ์œ„์˜ ๋ฐ์ดํ„ฐ ๊ตฌ์กฐ์—์„œ๋Š” "๋ณ€์ˆ˜" ๊ฐœ๋…์ด ์—†๋Š”๋ฐ ๋ฐ์ดํ„ฐ์˜ ํ๋ฆ„์ด ์ง์ ‘ ํ‘œํ˜„๋˜์—ˆ๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค; ๋Œ€์‹  ๋ฌธ๋ฒ• ๋ณ€ํ™˜์—์„œ ๋ณ€์ˆ˜๊ฐ€ ํ•„์š”ํ•ด์ง€๋ฉด ๊ทธ ๋•Œ ์ƒ์„ฑ๋  ๊ฒƒ์ด๋‹ค.

์•Œ๊ณ ๋ฆฌ์ฆ˜

DataFlow ๊ทธ๋ž˜ํ”„๋ฅผ ๋‹ค๋ฃจ๋Š” ๊ธฐ๋ณธ ์ ‘๊ทผ ๋ฐฉ์‹์€ ํ•จ์ˆ˜ํ˜• ํ”„๋กœ๊ทธ๋ž˜๋ฐ์—์„œ ํŠธ๋ฆฌ(tree)๋ฅผ ๋‹ค๋ฃจ๋Š” ํ…Œํฌ๋‹‰๊ณผ ๊ฐ™์€ ๊ฒƒ์„ ์‚ฌ์šฉํ•œ๋‹ค. ๊ทธ๋Ÿฌ๋‹ˆ๊นŒ, ์žฌ๊ท€์ ์œผ๋กœ(recursively) ์ด์ „ ๊ฒƒ์„ ๋ฐŸ์•„๋‚˜๊ฐ€๋ฉฐ ์ƒˆ๋กœ์šด ๊ทธ๋ž˜ํ”„๋ฅผ ์ƒ์„ฑํ•˜๋Š” ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ๋งŒ๋“ค๋„๋ก ํ•˜์ž. ๊ทธ๋ž˜ํ”„์— ์žˆ๋Š” ๊ฐ ๋…ธ๋“œ์— ํŠน์ • ํ•จ์ˆ˜๋ฅผ ์ ์šฉ(apply) ์‹œํ‚ค๋Š” ํ•จ์ˆ˜๋กœ์„œ, prewalk ์™€ postwalk ๊ฐ™์€๊ฒŒ ํŒจํ‚ค์ง€์— ๋“ค์–ด ์žˆ๋‹ค.

์˜ˆ๋ฅผ ๋ณด์ž:

julia> using DataFlow: postwalk, value

julia> foo = g(:+, g(:length, g(:xs)), g(:length, g(:ys)))
IVertex(length(xs()) + length(ys()))

julia> postwalk(foo) do v
         value(v) == :length && value(v[1]) == :xs ? g(:lx) : v
       end
IVertex(lx() + length(ys()))

(pre- ์™€ postwalk์˜ ์ฐจ์ด๋Š” ์ˆœํšŒ(traversal)ํ•˜๋Š” ์ˆœ์„œ์— ์žˆ๋‹ค. @show ๋ฅผ ํ†ตํ•ด์„œ ๋ณผ ์ˆ˜ ์žˆ๋‹ค.) ์ด ๋ฐฉ๋ฒ•์œผ๋กœ ๊ทธ๋ž˜ํ”„์—์„œ ์ฐพ๊ธฐ(find), ๋ฐ”๊พธ๊ธฐ(replace) ๊ฐ™์€ ๊ฒƒ์„ ํ•˜๊ฑฐ๋‚˜, ๋”์šฑ ๋ณต์žกํ•œ ๊ตฌ์กฐ ๋ณ€ํ™˜์— ์ ์šฉํ•  ์ˆ˜ ์žˆ๋‹ค. ๊ทธ๋Ÿผ ์ด์ œ ๊ณตํ†ต ๋ถ€๋ถ„ ํ‘œํ˜„์‹ ์ œ๊ฑฐ(cse, common subexpression elimination)๋ฅผ ํ•˜๋Š” ๊ธฐ๋ณธ์ ์ธ ๋ฐฉ๋ฒ•์„ ์€์—ฐ ์ค‘์— ํ„ฐ๋“ํ–ˆ์œผ๋‹ˆ ํ•œ๋ฒˆ ๊ตฌํ˜„ํ•ด ๋ณด์ž:

julia> cse(v::IVertex, cache = Dict()) =
         postwalk(x -> get!(cache, x, x), v)
cse (generic function with 2 methods)

(์—ญ์ฃผ: get!์€ ์‚ฌ์ „(Dict)์— ์—†๋Š” ํ‚ค๋ฅผ ์ €์žฅํ•œ๋‹ค.)

julia> d = Dict("a"=>1, "b"=>2, "c"=>3);

julia> get!(d, "a", 5)
1

julia> get!(d, "d", 4)
4

julia> d
Dict{String,Int64} with 4 entries:
  "c" => 3
  "b" => 2
  "a" => 1
  "d" => 4

๊ทธ๋ž˜ํ”„์˜ ๊ฐ ๋…ธ๋“œ๊ฐ€ ์‚ฌ์ „(Dict) ํƒ€์ž…์— ๋Œ์–ด๋“ค์ด๊ณ  ๊ฐ’๋“ค์€ ์ž๊ธฐ ์ž์‹ ์„ ์ฐธ์กฐ(refer) ํ•œ๋‹ค. ์ด๊ฒƒ์œผ๋กœ ๊ฒฐ๊ณผ ๊ทธ๋ž˜ํ”„์˜ ์–ด๋Š ๊ฐ’์ด๋“  ==๋Š” ===์™€ ๋งˆ์ฐฌ๊ฐ€์ง€์ธ๊ฒŒ (===๋Š” identical ๋น„๊ต) ํ™•์‹คํ•ด์ง€๋ฉฐ, ๊ณตํ†ต ํ‘œํ˜„์‹์€ ์žฌ์‚ฌ์šฉ๋œ๋‹ค.

julia> foo = @flow length(xs)+length(xs)
IVertex(length(xs) + length(xs))

julia> cse(foo)
IVertex(
eland = length(xs)
eland + eland)

์ผ๋ฐ˜์ ์œผ๋กœ DataFlow์˜ postwalk์™€ ๊ฐ™์€ ๊ณ ๊ธ‰ ์—ฐ์‚ฐ์— ๋Šฅ์ˆ™ํ•ด์•ผ ํ•˜์ง€๋งŒ, ์–ด๋–ค ๊ฒฝ์šฐ์—๋Š” ์ฒ˜์Œ๋ถ€ํ„ฐ ์žฌ๊ท€ ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ง์ ‘ ์งœ์•ผ ํ•  ๋•Œ๋„ ์žˆ๋‹ค. ํŠธ๋ฆฌ(a tree)์— ์ ์šฉํ•˜๋Š” ์•Œ๊ณ ๋ฆฌ์ฆ˜๊ณผ ๊ฐ™์•„ ๋ณด์ด์ง€๋งŒ ์ฃผ์˜ ์‚ฌํ•ญ์ด ์žˆ๋Š”๋ฐ (1) ๋™์ผํ•œ ๋…ธ๋“œ(identical nodes)๊ฐ€ ํŠธ๋ฆฌ์—์„œ ์—ฌ๋Ÿฌ ๋ฒˆ ๋„๋‹ฌํ•  ์ˆ˜ ์žˆ๋‹ค. (2) ์žฌ๊ท€ํ•˜๋‹ค ๊ทธ๋ž˜ํ”„์—์„œ ์‚ฌ์ดํด(cyle)์ด ๋ฐœ์ƒํ•˜์—ฌ ๋ฌดํ•œ ๋ฃจํ”„์— ๋น ์งˆ ์ˆ˜ ์žˆ๋‹ค.

์ž˜๋ชปํ•˜๋ฉด ์•…๋ชฝ์ฒ˜๋Ÿผ ๋ณด์ด์ง€๋งŒ ์‚ฌ์‹ค์€ ์ผ์„์ด์กฐ์ธ ๊ฒƒ์ด; ํ•จ์ˆ˜๋ฅผ memoize ํ•˜์—ฌ ๋…ธ๋“œ๋ฅผ ๋ฐ˜๋ณตํ•ด์„œ ๋ฐฉ๋ฌธํ•˜๋ฉด ์žฌ๊ท€๋ฅผ ๋๋‚ด๊ฒŒ ํ•˜์ž. ๊ทธ๋ฆฌ๊ณ  ์žฌ๊ท€ํ•˜๊ธฐ ์ „์—๋Š” ํ˜„์žฌ ํ˜ธ์ถœ์˜ ๊ฒฐ๊ณผ๋ฅผ ๊ผญ ์บ์‹œ(cache) ํ•˜๋„๋ก ํ•œ๋‹ค.

julia> using DataFlow: value, inputs, thread!

julia> function replace_xs(g, cache = ObjectIdDict())
         # ์ด ๋…ธ๋“œ๋ฅผ ์ด๋ฏธ ์ฒ˜๋ฆฌํ•œ ๊ฒฝ์šฐ์—๋Š” ๋น ๋ฅธ ์ข…๋ฃŒ
         haskey(cache, g) && return cache[g]
         # ์ƒˆ๋กœ์šด (๋น„์–ด์žˆ๋Š”) ๋…ธ๋“œ๋ฅผ ๋งŒ๋“ค๊ณ  ์บ์‹œ ํ•ด ๋‘ 
         cache[g] = gโ€ฒ = typeof(g)(value(g) == :xs ? :foo : value(g))
         # ์›๋ž˜ ๋…ธ๋“œ์˜ ์ž…๋ ฅ์€ ์ฒ˜๋ฆฌํ•˜๊ณ  ๊ฒฐ๊ณผ๋ฅผ ์ƒˆ๋กœ์šด ๋…ธ๋“œ์— ์ถ”๊ฐ€
         thread!(gโ€ฒ, (replace_xs(v, cache) for v in inputs(g))...)
       end
replace_xs (generic function with 2 methods)

julia> foo = DataFlow.cse(@flow length(xs)+length(xs))
IVertex(
alligator = length(xs)
alligator + alligator)

julia> replace_xs(foo)
IVertex(
alligator = length(xs)
alligator + alligator)

์—ฌ๊ธฐ์„œ๋Š” ์บ์‹œํ•˜๋Š” ๊ฒƒ์„ ์žŠ์–ด๋„ length(foo)+length(foo) ํ•˜๋‹ค ๋งํ•˜์ง„ ์•Š๋Š”๋ฐ, ๋‹ค๋ฅธ ๊ฒฝ์šฐ์—๋Š” ๋„์ค‘์— ๋ฉˆ์ถœ ์ˆ˜ ์žˆ๋‹ค.

๐Ÿฆ‰ ๋ฒˆ์—ญ ์™„๋ฃŒ 2018-03-15