Ensuring privacy during inference stage is crucial to prevent malicious third parties from reconstructing users' private inputs from outputs of public models. Despite a large body of literature on privacy preserving learning (which ensures privacy of training data), there is no existing systematic framework to ensure the privacy of users' data during inference. Motivated by this problem, we introduce the notion of Inference Privacy (IP), which can allow a user to interact with a model (for instance, a classifier, or an AI-assisted chat-bot) while providing a rigorous privacy guarantee for the users' data at inference. We establish fundamental properties of the IP privacy notion and also contrast it with the notion of Local Differential Privacy (LDP). We then present two types of mechanisms for achieving IP: namely, input perturbations and output perturbations which are customizable by the users and can allow them to navigate the trade-off between utility and privacy. We also demonstrate the usefulness of our framework via experiments and highlight the resulting trade-offs between utility and privacy during inference.