Repairing

The clean cells are used as training examples to learn the parameters (weights) of a softmax regression model. Once those weights are learned, we use the model to perform inference on the “don’t-know” cells and insert the most likely value into each cell.
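As a rough illustration of this train-then-infer flow, the sketch below fits a tiny softmax regression with plain stochastic gradient descent and then picks the most likely class for one unseen cell. It uses only the standard library rather than the tensors used by the class documented below; the helper names (`softmax`, `class_probabilities`, `fit`) and the toy data are assumptions made for the example, not part of the package.

```python
import math

def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_probabilities(weights, features):
    """weights[c][j] is the weight of feature j for class c."""
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(scores)

def fit(examples, labels, n_classes, lr=0.5, epochs=200):
    """Learn weights by stochastic gradient descent on the cross-entropy loss."""
    n_features = len(examples[0])
    weights = [[0.0] * n_features for _ in range(n_classes)]
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            probs = class_probabilities(weights, x)
            for c in range(n_classes):
                # Gradient of the cross-entropy w.r.t. the score of class c.
                error = probs[c] - (1.0 if c == y else 0.0)
                for j in range(n_features):
                    weights[c][j] -= lr * error * x[j]
    return weights

# Hypothetical toy data: two binary features, two candidate values (classes).
clean_x = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
clean_y = [0, 0, 1, 1]
weights = fit(clean_x, clean_y, n_classes=2)

# Inference on a "don't-know" cell: keep the most likely class.
dont_know_x = [0.0, 1.0]
probs = class_probabilities(weights, dont_know_x)
repaired = max(range(len(probs)), key=probs.__getitem__)
```

Here `repaired` is the index of the candidate value that would be written back into the cell, and `probs[repaired]` is the model's confidence in it.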

Softmax

class holoclean.learning.softmax.SoftMax(session, X_training)

build_model(featurizers, input_dim_non_dc, input_dim_dc, output_dim, tie_init=True, tie_DC=True)

Initializes the logreg part of our model

Parameters:
	input_dim_non_dc – number of init + cooccur features
	featurizers – list of featurizers
	input_dim_dc – number of dc features
	output_dim – number of classes
	tie_init – boolean to decide weight tying for init features
	tie_DC – boolean to decide weight tying for dc features
Returns:	newly created LogReg model
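The reference above does not spell out what weight tying means for this model. The sketch below assumes the common reading: a tied feature has a single parameter shared by every output class, while an untied feature has one parameter per class. `init_weights` is a hypothetical helper written for this illustration and is not part of HoloClean.

```python
import random

def init_weights(n_features, n_classes, tie=True, seed=0):
    """Return (weight_matrix, n_free_parameters).

    weight_matrix[j][c] is the weight of feature j for class c.
    With tie=True each feature gets one value repeated for every class;
    with tie=False every (feature, class) pair gets its own value.
    """
    rng = random.Random(seed)
    if tie:
        matrix = [[rng.gauss(0.0, 1.0)] * n_classes for _ in range(n_features)]
        return matrix, n_features
    matrix = [[rng.gauss(0.0, 1.0) for _ in range(n_classes)]
              for _ in range(n_features)]
    return matrix, n_features * n_classes

tied, n_tied = init_weights(n_features=3, n_classes=4, tie=True)
untied, n_untied = init_weights(n_features=3, n_classes=4, tie=False)
```

Under this assumption, tying cuts the number of free parameters from `n_features * n_classes` to `n_features`, which is one reason a model might tie weights when the number of classes is large.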

log_weights()

Writes the model weights to the logger

Returns:	None

logreg(featurizers)

Trains our model on clean cells and predicts values for clean cells

Returns:	predictions

predict(model, x_val, mask=None)

Runs our model on the test set

Parameters:
	model – trained logreg model
	x_val – test x tensor
	mask – masking tensor to restrict domain
Returns:	predicted classes with probabilities
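To make "predicted classes with probabilities" concrete, here is a minimal sketch that turns rows of raw class scores into `(class, probability)` pairs, optionally restricting the domain with a mask. It assumes the mask is additive (0 for allowed classes, a large negative number for excluded ones); `predict_with_mask` and the sample scores are hypothetical and only mirror the idea of the method above, not its implementation.

```python
import math

def predict_with_mask(score_rows, mask_rows=None):
    """Return one (best_class, probability) pair per row of scores."""
    results = []
    for i, scores in enumerate(score_rows):
        if mask_rows is not None:
            # Assumed additive mask: excluded classes get a huge negative score.
            scores = [s + m for s, m in zip(scores, mask_rows[i])]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        probs = [e / total for e in exps]
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((best, probs[best]))
    return results

scores = [[2.0, 1.0, 3.0], [0.5, 0.1, 4.0]]
mask = [[0.0, 0.0, 0.0], [0.0, 0.0, -1e6]]  # class 2 is excluded for row 1
predictions = predict_with_mask(scores, mask)
```

In the second row the highest raw score belongs to class 2, but because that class is masked out the prediction falls back to the best class that remains in the domain.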

save_prediction(Y)

Stores our predicted values in the database

Parameters:	Y – tensor with probability for each class
Returns:	None

setupMask(clean=1, N=1, L=1)

Initializes a masking tensor for ignoring impossible classes

Parameters:
	clean – 1 if clean cells, 0 if don’t-know
	N – number of examples
	L – number of classes
Returns:	masking tensor
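One way such a mask could be laid out is sketched below: an N × L grid holding 0 for classes a cell may take and a large negative constant for impossible ones, so that adding the mask to the class scores drives the impossible classes to near-zero probability. The constant `NEG`, the helper `build_mask`, and the assumption that each cell's candidate values occupy the first slots of its row are all illustrative choices, not taken from the HoloClean source.

```python
NEG = -1e6  # assumed "large negative" value used to rule a class out

def build_mask(domain_sizes, n_classes):
    """Row i is 0.0 for the first domain_sizes[i] classes and NEG for the rest."""
    return [[0.0 if c < size else NEG for c in range(n_classes)]
            for size in domain_sizes]

# Three cells (N = 3) with 3, 1 and 2 candidate values, padded to L = 3 classes.
mask = build_mask([3, 1, 2], n_classes=3)
```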

setuptrainingX(sparse=0)

Initializes an X tensor of features for training

Parameters:	sparse – 0 if dense tensor, 1 if sparse
Returns:	x tensor of features
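The difference between the two storage options can be pictured with a small, library-free sketch: a dense layout stores every entry, while a sparse (coordinate) layout stores only the non-zero ones. The two-dimensional shape and the helpers `to_sparse` and `to_dense` are assumptions for illustration; the actual tensor layout used by this method is not described here.

```python
def to_sparse(dense):
    """Map (row, column) -> value for the non-zero entries of a 2-D list."""
    return {(i, j): v
            for i, row in enumerate(dense)
            for j, v in enumerate(row)
            if v != 0}

def to_dense(sparse, n_rows, n_cols):
    """Rebuild the full 2-D list from a coordinate dictionary."""
    dense = [[0.0] * n_cols for _ in range(n_rows)]
    for (i, j), v in sparse.items():
        dense[i][j] = v
    return dense

dense_x = [[1.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
sparse_x = to_sparse(dense_x)       # only two entries are stored
round_trip = to_dense(sparse_x, 2, 3)
```

Feature matrices built from indicator-style features tend to be mostly zeros, which is the usual motivation for offering a sparse option.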

train(model, loss, optimizer, x_val, y_val, mask=None)

Trains our model on the clean cells

Parameters:
	model – logistic regression model
	loss – loss function used for evaluating performance
	optimizer – optimizer for our neural net
	x_val – x tensor - features
	y_val – y tensor - output for comparison
	mask – masking tensor
Returns:	cost of training