This paper proposes a novel fitting procedure for Markov Modulated Poisson Processes (MMPPs), consisting of the superposition of N 2-MMPPs, that is capable of capturing the long-range characteristics of the traffic. The procedure matches both the autocovariance and marginal distribution functions of the rate process. We start by matching each 2-MMPP to a different component of the autocovariance function. We then map the parameters of the model with N individual 2-MMPPs (termed superposed MMPP) to the parameters of the equivalent MMPP with 2N states that results from the superposition of the N individual 2-MMPPs (termed generic MMPP). Finally, the parameters of the generic MMPP are fitted to the marginal distribution, subject to the constraints imposed by the autocovariance matching. Specifically, the matching of the distribution will be restricted by the fact that it may not be possible to decompose a generic MMPP back into individual 2-MMPPs. Overall, our procedure is motivated by the fact that direct relationships can be established between the autocovariance and the parameters of the superposed MMPP and between the marginal distribution and the parameters of the generic MMPP. We apply the fitting procedure to traffic traces exhibiting LRD including (i) IP traffic measured at our institution and (ii) IP traffic traces available in the Internet such as the well known, publicly available, Bellcore traces. The selected traces are representative of a wide range of services/protocols used in the Internet. We assess the fitting procedure by comparing the measured and fitted traces (traces generated from the fitted models) in terms of (i) Hurst parameter; (ii) degree of approximation between the autocovariance and marginal distribution curves; (iii) range of time scales where LRD is observed using a wavelet based estimator and (iv) packet loss ratio suffered in a single buffer for different values of the buffer capacity. Results are very clear in showing that MMPPs, when used in conjunction with the proposed fitting procedure, can be used to model efficiently Internet traffic in the relevant time scales, even when exhibiting LRD behavior.