* add checkpoint annotation for checkpointing memory optimization
* add alpha-equivalence checkpoint test and fix gradient type issue
* fix build issues
* ignore checkpoint annotation when checking missing gradients
* refactor, fix checkpoint compute for tuple and add tests