If you mux a 120 fps AVI with MP4Box, I think it'll drop the null frames and leave you with a VFR MP4.
Negative. You must process the video through TDecimate() or DeDup() to produce the VFRaC file. It will also spit back a timecodes.txt file for use when muxing to MKV. MP4 cannot really contain VFR content. The way MP4 handles it is as sections of different framerates concatenated into one file. For instance, the first minute will be 24p, the next 30i, then back to 24p, and oh hell, while we're at it, up to 60p, and then back down to 24p.
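To make the "sections of different framerates" idea concrete, here is a rough Python sketch of how per-frame durations collapse into runs of (frame count, duration), which is the general shape of MP4's time-to-sample (stts) table. The function name, the 120000 Hz timescale, and the sample durations are assumptions picked for illustration, not anything MP4Box outputs.

```python
from itertools import groupby

def run_length_durations(durations):
    """Collapse per-frame durations into (frame_count, duration) runs."""
    return [(len(list(group)), delta) for delta, group in groupby(durations)]

# Durations in ticks of a hypothetical 120000 Hz timescale:
#   5005 ticks = 1/23.976 s, 4004 ticks = 1/29.97 s, 2002 ticks = 1/59.94 s
durations = [5005] * 3 + [4004] * 2 + [5005] * 2 + [2002] * 4

runs = run_length_durations(durations)
for count, delta in runs:
    print(f"{count} frames at {120000 / delta:.3f} fps")
```

Each run is one constant-framerate section, so a file that changes rate often just ends up with more runs.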
DeDup() is pretty nifty because if you set the threshold just right, it will discard the second (duplicate) frame in a two-frame pattern and will increase your encoding efficiency tremendously. Imagine a character just staring at the camera, perfectly still, for half a second before beginning to walk away. DeDup() will only give you the first frame of the stare before moving on to the walk. With the timecode file, the player is instructed to hold that first frame on screen for the length of 12 frames (half a second at 24 fps) before it shows the next frame of the walk.
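Here is a minimal Python sketch of that stare-then-walk example, just to show where the timecodes come from. It is not DeDup()'s actual algorithm: DeDup() compares frames against a difference threshold, while this stand-in uses exact equality on placeholder strings. The function names and the 24 fps source rate are assumptions. The output uses the "timecode format v2" layout that mkvmerge reads, one timestamp in milliseconds per kept frame.

```python
from fractions import Fraction

def dedup_with_timecodes(frames, fps=Fraction(24)):
    """Drop frames identical to the last kept frame and record when each
    kept frame should appear, in milliseconds."""
    kept = []
    timecodes_ms = []
    for index, frame in enumerate(frames):
        if kept and frame == kept[-1]:
            continue  # duplicate: the previous kept frame stays on screen
        kept.append(frame)
        timecodes_ms.append(float(index * 1000 / fps))
    return kept, timecodes_ms

def format_timecodes_v2(timecodes_ms):
    """Render the timestamps as timecode format v2 text."""
    lines = ["# timecode format v2"]
    lines += [f"{t:.3f}" for t in timecodes_ms]
    return "\n".join(lines) + "\n"

# Half a second of an unchanging stare (12 frames at 24 fps), then the walk.
frames = ["stare"] * 12 + ["walk1", "walk2"]
kept, timecodes_ms = dedup_with_timecodes(frames)
timecodes_text = format_timecodes_v2(timecodes_ms)
print(kept)
print(timecodes_text)
```

Only three frames get encoded instead of fourteen, and the gap between the first and second timestamps is what keeps the stare on screen for the full half second.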