* Improve non_max_suppression for CPU
* Improve get_valid_counts
* Minor change
* Skip some unnecessary computes