Yes. I’ve implemented a lot of new things recently to make training more stable (which also allows training deeper models, although this one isn’t particularly deep).
Architecture details:
* List of modules:
- Module: IN (type: input)
* Output size: 32,32,1,1
- Module: IN_a (type: clone)
* Input: IN (32,32,1,1)
* Output size: 32,32,1,1
- Module: IN_b (type: clone)
* Input: IN (32,32,1,1)
* Output size: 32,32,1,1
- Module: FM_conv2d (type: conv2d)
* Input: IN_a (32,32,1,1)
* Output size: 32,32,1,32
* Properties: kernel=3x3, stride=1, dilation=1, border_shrink=0, boundary_conditions=neumann, learning_mode=3, regularization=0
* Parameters: 320
- Module: FM (type: rename)
* Input: FM_conv2d (32,32,1,32)
* Output size: 32,32,1,32
- Module: FM_a (type: clone)
* Input: FM (32,32,1,32)
* Output size: 32,32,1,32
- Module: FM_b (type: clone)
* Input: FM (32,32,1,32)
* Output size: 32,32,1,32
- Module: C1_1_conv2d (type: conv2d)
* Input: FM_a (32,32,1,32)
* Output size: 32,32,1,32
* Properties: kernel=3x3, stride=1, dilation=1, border_shrink=0, boundary_conditions=neumann, learning_mode=3, regularization=0
* Parameters: 9248
- Module: C1_1 (type: nl)
* Input: C1_1_conv2d (32,32,1,32)
* Output size: 32,32,1,32
* Property: activation=leakyrelu
- Module: C1_2_conv2d (type: conv2d)
* Input: C1_1 (32,32,1,32)
* Output size: 32,32,1,32
* Properties: kernel=3x3, stride=1, dilation=1, border_shrink=0, boundary_conditions=neumann, learning_mode=3, regularization=0
* Parameters: 9248
- Module: C1_2 (type: rename)
* Input: C1_2_conv2d (32,32,1,32)
* Output size: 32,32,1,32
- Module: pB1 (type: add)
* Inputs: FM_b,C1_2 (32,32,1,32 and 32,32,1,32)
* Output size: 32,32,1,32
- Module: B1 (type: nl)
* Input: pB1 (32,32,1,32)
* Output size: 32,32,1,32
* Property: activation=leakyrelu
- Module: B1_a (type: clone)
* Input: B1 (32,32,1,32)
* Output size: 32,32,1,32
- Module: B1_b (type: clone)
* Input: B1 (32,32,1,32)
* Output size: 32,32,1,32
- Module: C2_1_conv2d (type: conv2d)
* Input: B1_a (32,32,1,32)
* Output size: 32,32,1,32
* Properties: kernel=3x3, stride=1, dilation=1, border_shrink=0, boundary_conditions=neumann, learning_mode=3, regularization=0
* Parameters: 9248
- Module: C2_1 (type: nl)
* Input: C2_1_conv2d (32,32,1,32)
* Output size: 32,32,1,32
* Property: activation=leakyrelu
- Module: C2_2_conv2d (type: conv2d)
* Input: C2_1 (32,32,1,32)
* Output size: 32,32,1,32
* Properties: kernel=3x3, stride=1, dilation=1, border_shrink=0, boundary_conditions=neumann, learning_mode=3, regularization=0
* Parameters: 9248
- Module: C2_2 (type: rename)
* Input: C2_2_conv2d (32,32,1,32)
* Output size: 32,32,1,32
- Module: pB2 (type: add)
* Inputs: B1_b,C2_2 (32,32,1,32 and 32,32,1,32)
* Output size: 32,32,1,32
- Module: B2 (type: nl)
* Input: pB2 (32,32,1,32)
* Output size: 32,32,1,32
* Property: activation=leakyrelu
- Module: UP_conv2d (type: conv2d)
* Input: B2 (32,32,1,32)
* Output size: 64,64,1,64
* Properties: kernel=3x3, stride=0.5, dilation=1, border_shrink=0, boundary_conditions=neumann, learning_mode=3, regularization=0
* Parameters: 18496
- Module: UP (type: rename)
* Input: UP_conv2d (64,64,1,64)
* Output size: 64,64,1,64
- Module: UP_a (type: clone)
* Input: UP (64,64,1,64)
* Output size: 64,64,1,64
- Module: UP_b (type: clone)
* Input: UP (64,64,1,64)
* Output size: 64,64,1,64
- Module: C3_1_conv2d (type: conv2d)
* Input: UP_a (64,64,1,64)
* Output size: 64,64,1,64
* Properties: kernel=3x3, stride=1, dilation=1, border_shrink=0, boundary_conditions=neumann, learning_mode=3, regularization=0
* Parameters: 36928
- Module: C3_1 (type: nl)
* Input: C3_1_conv2d (64,64,1,64)
* Output size: 64,64,1,64
* Property: activation=leakyrelu
- Module: C3_2_conv2d (type: conv2d)
* Input: C3_1 (64,64,1,64)
* Output size: 64,64,1,64
* Properties: kernel=3x3, stride=1, dilation=1, border_shrink=0, boundary_conditions=neumann, learning_mode=3, regularization=0
* Parameters: 36928
- Module: C3_2 (type: rename)
* Input: C3_2_conv2d (64,64,1,64)
* Output size: 64,64,1,64
- Module: pB3 (type: add)
* Inputs: UP_b,C3_2 (64,64,1,64 and 64,64,1,64)
* Output size: 64,64,1,64
- Module: B3 (type: nl)
* Input: pB3 (64,64,1,64)
* Output size: 64,64,1,64
* Property: activation=leakyrelu
- Module: B3_a (type: clone)
* Input: B3 (64,64,1,64)
* Output size: 64,64,1,64
- Module: B3_b (type: clone)
* Input: B3 (64,64,1,64)
* Output size: 64,64,1,64
- Module: C4_1_conv2d (type: conv2d)
* Input: B3_a (64,64,1,64)
* Output size: 64,64,1,64
* Properties: kernel=3x3, stride=1, dilation=1, border_shrink=0, boundary_conditions=neumann, learning_mode=3, regularization=0
* Parameters: 36928
- Module: C4_1 (type: nl)
* Input: C4_1_conv2d (64,64,1,64)
* Output size: 64,64,1,64
* Property: activation=leakyrelu
- Module: C4_2_conv2d (type: conv2d)
* Input: C4_1 (64,64,1,64)
* Output size: 64,64,1,64
* Properties: kernel=3x3, stride=1, dilation=1, border_shrink=0, boundary_conditions=neumann, learning_mode=3, regularization=0
* Parameters: 36928
- Module: C4_2 (type: rename)
* Input: C4_2_conv2d (64,64,1,64)
* Output size: 64,64,1,64
- Module: pB4 (type: add)
* Inputs: B3_b,C4_2 (64,64,1,64 and 64,64,1,64)
* Output size: 64,64,1,64
- Module: B4 (type: nl)
* Input: pB4 (64,64,1,64)
* Output size: 64,64,1,64
* Property: activation=leakyrelu
- Module: RESIDUAL_conv2d (type: conv2d)
* Input: B4 (64,64,1,64)
* Output size: 64,64,1,1
* Properties: kernel=3x3, stride=1, dilation=1, border_shrink=0, boundary_conditions=neumann, learning_mode=3, regularization=0
* Parameters: 577
- Module: RESIDUAL (type: rename)
* Input: RESIDUAL_conv2d (64,64,1,1)
* Output size: 64,64,1,1
- Module: upIN (type: resize)
* Input: IN_b (32,32,1,1)
* Output size: 64,64,1,1
* Property: interpolation=3
- Module: OUT (type: add)
* Inputs: upIN,RESIDUAL (64,64,1,1 and 64,64,1,1)
* Output size: 64,64,1,1
* Total: 43 modules, 204097 parameters.
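
If it helps, here is a rough PyTorch equivalent of that graph. This is a hand translation, not my library's code: the `ResBlock`/`SR2x` names are made up, `stride=0.5` is read as a fractionally-strided (transposed) convolution, `boundary_conditions=neumann` as replicate padding, `interpolation=3` as bilinear (the exact mode depends on the library's convention), and the LeakyReLU slope, which the dump doesn't give, is left at PyTorch's default:

```python
# Hedged PyTorch sketch of the module dump above (names and several
# interpretations are assumptions, see the notes in the lead-in).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResBlock(nn.Module):
    """Cx_1 (conv + leakyrelu) -> Cx_2 (conv), added to the block input
    (pBx), then a final leakyrelu (Bx)."""
    def __init__(self, ch):
        super().__init__()
        self.c1 = nn.Conv2d(ch, ch, 3, padding=1, padding_mode='replicate')
        self.c2 = nn.Conv2d(ch, ch, 3, padding=1, padding_mode='replicate')
        self.act = nn.LeakyReLU()  # slope not specified in the dump

    def forward(self, x):
        return self.act(x + self.c2(self.act(self.c1(x))))

class SR2x(nn.Module):
    def __init__(self):
        super().__init__()
        self.fm = nn.Conv2d(1, 32, 3, padding=1, padding_mode='replicate')   # FM: 320 params
        self.b1, self.b2 = ResBlock(32), ResBlock(32)                        # C1_*/C2_*: 4 x 9248
        # UP: stride=0.5 conv read as a transposed conv doubling the resolution, 18496 params
        self.up = nn.ConvTranspose2d(32, 64, 3, stride=2, padding=1, output_padding=1)
        self.b3, self.b4 = ResBlock(64), ResBlock(64)                        # C3_*/C4_*: 4 x 36928
        self.res = nn.Conv2d(64, 1, 3, padding=1, padding_mode='replicate')  # RESIDUAL: 577 params

    def forward(self, x):                                                    # x: (N,1,32,32)
        y = self.res(self.b4(self.b3(self.up(self.b2(self.b1(self.fm(x)))))))
        base = F.interpolate(x, scale_factor=2, mode='bilinear',
                             align_corners=False)                            # upIN
        return base + y                                                      # OUT: (N,1,64,64)

# Sanity check: same parameter count as the dump reports.
assert sum(p.numel() for p in SR2x().parameters()) == 204097
```

So the network only learns a residual on top of a plain ×2 upscale of its input, which is a common trick for this kind of super-resolution model.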
No batch norm here, because the residual blocks already let the gradients backpropagate more or less correctly. I do use gradient clipping, though I haven’t tested whether it’s strictly necessary. And I actually use `double` for the computations during training, even though I store the resulting weights as `float` after each iteration.
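
Sketched in PyTorch (again just a sketch, reusing `SR2x` from above; the optimizer, loss, and norm-based clipping are placeholders, the `float` round-trip after each step is the relevant part), that training step looks roughly like this:

```python
# Hedged sketch: compute in float64, clip gradients, then round-trip the
# weights through float32 after each update. Optimizer and loss are assumed.
import torch
import torch.nn.functional as F

model = SR2x().double()                              # all computation in float64
opt = torch.optim.Adam(model.parameters(), lr=1e-3)  # optimizer/lr are placeholders

def train_step(x, target, clip=1.0):
    opt.zero_grad()
    loss = F.mse_loss(model(x.double()), target.double())
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), clip)  # gradient clipping
    opt.step()
    with torch.no_grad():
        for p in model.parameters():
            p.copy_(p.float().double())              # store as float32, keep computing in float64
    return loss.item()
```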
So really nothing very advanced; I can’t do very complicated things with my library anyway.