You can use the sigmoid function, but you should remember that its output is asymptotic: the sigmoid "goes to 0" at minus infinity and "goes to 1" at infinity, but never actually reaches those values. Training your network to hit 0 or 1 exactly pushes the weights toward very large magnitudes, which hurts convergence.
If you use the sigmoid function, you should probably train your network against targets in a restricted range like [0.25, 0.75], with the lower value standing in for 0 and the upper for 1, and then interpret outputs under 0.5 as 0 and outputs over 0.5 as 1.
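A minimal sketch of the idea, using the [0.25, 0.75] targets mentioned above (the helper names `encode_label` and `decode_output` are my own, just for illustration):

```python
import math

def sigmoid(x):
    """Logistic sigmoid: approaches 0 and 1 asymptotically, never reaches them."""
    return 1.0 / (1.0 + math.exp(-x))

# Even large pre-activations only get close to the asymptotes,
# so exact 0/1 targets drive the weights toward infinity:
print(sigmoid(10))   # ~0.99995, still short of 1
print(sigmoid(-10))  # ~0.00005, still above 0

TARGET_LOW, TARGET_HIGH = 0.25, 0.75  # soft stand-ins for classes 0 and 1

def encode_label(label):
    """Map a hard 0/1 label to the soft training target."""
    return TARGET_HIGH if label == 1 else TARGET_LOW

def decode_output(y):
    """Interpret the sigmoid output: above 0.5 means class 1."""
    return 1 if y > 0.5 else 0
```

Because the soft targets are reachable with modest pre-activation values, the weights stay in a sane range, while the 0.5 threshold still recovers the hard class at prediction time.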