Overview
The Multivariate Anomaly Simulation Engine (MASE) is one of two Python packages (along with MITTEN) developed by a team of three college students studying Computational Modeling and Data Analytics, Computer Science, Mathematics, and Statistics. Issues can be reported on GitHub.
class mase.Simulation(n_observations, means: numpy.ndarray = None, covariance_matrix: numpy.ndarray = None)
A Simulation object stores a Pandas DataFrame where the number of features is determined by the means or covariance_matrix arguments supplied at initialization. If neither means nor covariance_matrix is supplied, the number of features will be set to n_observations, and thus the DataFrame will be square.
Currently, MASE only supports simulation of multivariate normal data.
- Parameters
n_observations – number of observations to simulate
means – Optional; numpy array of means corresponding to each feature
covariance_matrix – Optional; numpy array holding the covariance matrix that the simulated data should emulate
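The feature-count rule described above can be sketched with plain NumPy. This is not MASE's implementation: `simulate_mvn` is a hypothetical stand-in, and the defaults used when an argument is omitted (zero means, identity covariance) are assumptions that the documentation does not confirm.

```python
import numpy as np


def simulate_mvn(n_observations, means=None, covariance_matrix=None, seed=0):
    """Hypothetical stand-in for a multivariate normal simulation (not MASE code)."""
    # Infer the number of features from whichever argument is supplied.
    if means is not None:
        n_features = len(means)
    elif covariance_matrix is not None:
        n_features = np.asarray(covariance_matrix).shape[0]
    else:
        # Neither argument given: one feature per observation, so the result is square.
        n_features = n_observations

    # Assumed defaults (not confirmed by the MASE documentation).
    if means is None:
        means = np.zeros(n_features)
    if covariance_matrix is None:
        covariance_matrix = np.eye(n_features)

    rng = np.random.default_rng(seed)
    # Rows are observations, columns are features.
    return rng.multivariate_normal(means, covariance_matrix, size=n_observations)


# Two correlated features, 500 observations.
data = simulate_mvn(
    500,
    means=np.array([0.0, 5.0]),
    covariance_matrix=np.array([[1.0, 0.8], [0.8, 2.0]]),
)

# No means or covariance matrix: the output has as many features as observations.
square = simulate_mvn(4)
```

Here `data` has one row per observation and one column per feature, which is the layout a DataFrame of simulated data would be expected to follow.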
add_gaussian_observations(summary_df, feature_index, df=None, visualize=False, append=False)
- Parameters
summary_df –
Contains the mean and standard deviation of each Gaussian distribution being added to a feature, along with the number of observations to draw. Means are expressed as a multiple of the feature's standard deviation. Standard deviations are expressed as a multiple of the feature's own standard deviation.
For example:
mean    sd     n_obs
2.3     1.2    10
0       1.3    20
Feature at feature_index will gain 10 Gaussian-distributed observations with mean mean+2.3*sd and standard deviation 1.2*sd, and 20 observations with mean 0 and standard deviation 1.3*sd. These observations will either be appended or will overwrite existing observations, depending on the append parameter.
feature_index – index of feature to be shifted
df – Optional; if not None, this method is being used as a function on a DataFrame rather than a method on a Simulation object.
visualize – Optional; whether or not to plot the results
append – Optional; if True, new observations will be appended to the DataFrame. Else, trailing observations are overwritten.
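The summary table and the append/overwrite behaviour can be illustrated with a small NumPy and pandas sketch. It does not call MASE: `draw_gaussian_observations` is a hypothetical helper, and it assumes that the mean column is always an offset from the feature's own mean in units of its standard deviation (so a value of 0 keeps the batch centred on the existing mean) and that the population standard deviation is used. Neither assumption is confirmed by the documentation.

```python
import numpy as np
import pandas as pd

# Summary table from the example: one row per batch of new observations.
summary_df = pd.DataFrame({"mean": [2.3, 0.0], "sd": [1.2, 1.3], "n_obs": [10, 20]})


def draw_gaussian_observations(feature, summary_df, seed=0):
    """Hypothetical helper (not MASE code): draw new observations for one feature."""
    rng = np.random.default_rng(seed)
    base_mean = feature.mean()
    base_sd = feature.std()  # population sd (ddof=0); an assumption
    batches = []
    for row in summary_df.itertuples(index=False):
        # Assumed reading: shift the mean by row.mean * sd, scale the sd by row.sd.
        batch = rng.normal(base_mean + row.mean * base_sd, row.sd * base_sd, int(row.n_obs))
        batches.append(batch)
    return np.concatenate(batches)


# An existing feature with 200 observations.
feature = np.random.default_rng(1).normal(10.0, 2.0, 200)
new_obs = draw_gaussian_observations(feature, summary_df)

# append=True analogue: the feature grows by the number of new observations.
appended = np.concatenate([feature, new_obs])

# append=False analogue: the trailing observations are replaced in place.
overwritten = feature.copy()
overwritten[-len(new_obs):] = new_obs
```

With this table, 30 new observations are drawn in total. The appended series has 230 values, while the overwritten series keeps its original length of 200.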
get_data()
Getter for the DataFrame of the Simulation object.
- Returns
Pandas DataFrame