How Much Backtracking is Enough? Exploring the Interplay of SFT and RL in Enhancing LLM Reasoning Paper • 2505.24273 • Published May 30 • 5
VeriCoT: Neuro-symbolic Chain-of-Thought Validation via Logical Consistency Checks Paper • 2511.04662 • Published Nov 6 • 34
junlinw/opc-sft-s2-annealing-ins3-python-precode0.5-cb_AST_0.5_200.0_var3 Viewer • Updated Oct 6 • 2.24M • 33
junlinw/opc-sft-s2-annealing-ins3-python-precode0.5-cb_AST_0.0_200.0_var3 Viewer • Updated Oct 6 • 2.24M • 26
junlinw/opc-sft-s2-annealing-ins3-python-precode0.5-cb-og0.1entire_AST_0.5_200.0_var3 Viewer • Updated Oct 6 • 2.24M • 32
junlinw/opc-sft-s2-annealing-ins3-python-precode0.5-cb-og0.1entire_AST_0.0_200.0_var3 Viewer • Updated Oct 6 • 2.24M • 31
junlinw/opc-sft-s2-annealing-ins3-python-precode0.5-cb-og0.1entire_AST_1.0_200.0_var3 Viewer • Updated Oct 6 • 2.24M • 31
junlinw/opc-sft-s2-annealing-ins3-python-precode0.5-cb-og0.1entire_OG_var3 Viewer • Updated Oct 6 • 2.24M • 37
junlinw/opc-sft-s2-annealing-ins3-python-precode0.5-cb_AST_1.0_200.0_var3 Viewer • Updated Oct 6 • 2.24M • 35
junlinw/opc-sft-s2-annealing-ins3-python-precode0.5-og0.1_AST_0.0_200.0_var3 Viewer • Updated Oct 5 • 2.24M • 31
junlinw/opc-sft-s2-annealing-ins3-python-precode0.5-og0.1_AST_0.5_200.0_var3 Viewer • Updated Oct 5 • 2.24M • 32