Overview

The Multivariate Anomaly Simulation Engine (MASE) is one of two Python packages (along with MITTEN) developed by a team of three college students studying Computational Modeling and Data Analytics, Computer Science, Mathematics, and Statistics. Issues can be reported on GitHub.

class mase.Simulation(n_observations, means: numpy.ndarray = None, covariance_matrix: numpy.ndarray = None)

A Simulation object stores a pandas DataFrame whose number of features is determined by the means or covariance_matrix argument supplied at initialization. If neither means nor covariance_matrix is supplied, the number of features is set to n_observations, so the DataFrame is square.

Currently, MASE only supports simulation of multivariate normal data.

Parameters
  • n_observations – number of observations to simulate

  • means – Optional; numpy array of means corresponding to each feature

  • covariance_matrix – Optional; numpy array holding the covariance matrix that the simulated data should emulate
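
To make the feature-count rule concrete, here is a minimal sketch of the same idea written directly against numpy and pandas. It is not MASE's implementation: the function name simulate_mvn, the seed argument, and the fallback defaults (zero means, identity covariance) are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd


def simulate_mvn(n_observations, means=None, covariance_matrix=None, seed=None):
    """Hypothetical stand-in for the idea behind mase.Simulation (not MASE code)."""
    # Infer the number of features from whichever argument was supplied;
    # fall back to n_observations, which yields a square DataFrame.
    if means is not None:
        n_features = len(means)
    elif covariance_matrix is not None:
        n_features = covariance_matrix.shape[0]
    else:
        n_features = n_observations

    # Assumed defaults (not documented by MASE): zero means, identity covariance.
    if means is None:
        means = np.zeros(n_features)
    if covariance_matrix is None:
        covariance_matrix = np.eye(n_features)

    rng = np.random.default_rng(seed)
    data = rng.multivariate_normal(means, covariance_matrix, size=n_observations)
    return pd.DataFrame(data)


# Two features, inferred from the length of `means`.
df = simulate_mvn(
    500,
    means=np.array([0.0, 5.0]),
    covariance_matrix=np.array([[1.0, 0.8], [0.8, 2.0]]),
    seed=0,
)

# Neither argument supplied: the result has n_observations rows and columns.
square = simulate_mvn(4, seed=0)
```

With real MASE installed, the equivalent call would presumably be mase.Simulation(500, means, covariance_matrix) followed by get_data().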

add_gaussian_observations(summary_df, feature_index, df=None, visualize=False, append=False)
Parameters
  • summary_df

    Contains the mean and standard deviation of each Gaussian distribution being added to a feature. Means are expressed as multiples of the feature's standard deviation. Standard deviations are likewise expressed as multiples of the feature's own standard deviation.

    For example:

    mean    sd     n_obs
    2.3     1.2    10
    0       1.3    20

    The feature at feature_index will gain 10 Gaussian-distributed observations with mean mean+2.3*sd and standard deviation 1.2*sd, and 20 observations with mean 0 and standard deviation 1.3*sd. These observations are either appended to the DataFrame or written over its trailing observations, depending on the append parameter.

  • feature_index – index of feature to be shifted

  • df – Optional; if not None, this method is being used as a function on a DataFrame rather than a method on a Simulation object.

  • visualize – Optional; whether or not to plot the results

  • append – Optional; if True, new observations will be appended to the DataFrame. Otherwise, trailing observations are overwritten.
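
The summary_df semantics and the append/overwrite behaviour can be sketched on a single feature as follows. This is an illustration under stated assumptions, not the library's code: inject_gaussian and its seed argument are hypothetical, the feature is handled as a standalone pandas Series, and a mean entry of 0 is read as "no shift from the feature's mean", which is only one possible reading of the description above.

```python
import numpy as np
import pandas as pd


def inject_gaussian(feature, summary_df, append=False, seed=None):
    """Hypothetical sketch of the summary_df semantics on one feature."""
    rng = np.random.default_rng(seed)
    base_mean = feature.mean()
    base_sd = feature.std()

    # One batch per summary row: the mean is offset by m * sd and the
    # spread is scaled to s * sd, where sd is the feature's own std.
    batches = [
        rng.normal(base_mean + m * base_sd, s * base_sd, int(n))
        for m, s, n in zip(summary_df["mean"], summary_df["sd"], summary_df["n_obs"])
    ]
    new_values = np.concatenate(batches)

    if append:
        # Keep every original observation and add the new ones at the end.
        return pd.concat([feature, pd.Series(new_values)], ignore_index=True)

    if len(new_values) > len(feature):
        raise ValueError("more new observations than existing ones to overwrite")

    # Overwrite the trailing observations with the new values.
    result = feature.reset_index(drop=True).copy()
    result.iloc[len(result) - len(new_values):] = new_values
    return result


summary_df = pd.DataFrame({"mean": [2.3, 0.0], "sd": [1.2, 1.3], "n_obs": [10, 20]})
feature = pd.Series(np.random.default_rng(0).normal(50.0, 4.0, 100))

overwritten = inject_gaussian(feature, summary_df, append=False, seed=1)
appended = inject_gaussian(feature, summary_df, append=True, seed=1)
```

Here the overwritten result keeps its original length of 100 with the last 30 values replaced, while the appended result grows to 130 observations.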

get_data()

Returns the DataFrame stored in the Simulation object.

Returns

Pandas DataFrame