How to build a source classifier from source-domain data in MATLAB
train_data is the training feature matrix (one row per sample), and train_label holds the corresponding class labels.

Predict_label is the predicted label.

Train on the data in MATLAB to obtain the semantic label vector Scores (the probability output).

1. Logistic Regression (Multinomial Logistic Regression)

Factor = mnrfit(train_data, train_label);

Scores = mnrval(Factor, test_data) ;

Scores is the semantic vector (probability output). This approach cannot cope with high-dimensional features.
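The two calls above can be tried end to end on toy data. The following is only a minimal sketch: the three-class data, the variable names X, y, centers and isTest, and the train/test split are all assumptions made for illustration, not part of the original experiment. It also shows one way to turn Scores into hard labels.

```matlab
% Hypothetical three-class toy data, not the data from the original experiment.
rng(0);                                  % fixed seed for repeatability
n = 100;                                 % samples per class
centers = [0 0; 2 2; 4 0];               % one row per class centre
X = []; y = [];
for k = 1:3
    X = [X; randn(n, 2) + repmat(centers(k, :), n, 1)];
    y = [y; k * ones(n, 1)];
end

% Hold out every fourth sample for testing.
isTest      = mod((1:numel(y))', 4) == 0;
train_data  = X(~isTest, :);   train_label = y(~isTest);
test_data   = X(isTest, :);    test_label  = y(isTest);

% mnrfit expects labels as positive integers 1..K (or a categorical array).
Factor = mnrfit(train_data, train_label);

% One row per test sample, one column per class; each row sums to 1.
Scores = mnrval(Factor, test_data);

% Hard label = class with the highest probability.
[~, Predict_label] = max(Scores, [], 2);
accuracy = mean(Predict_label == test_label) * 100;
fprintf('Logistic regression accuracy: %.1f%%\n', accuracy);
```

With real data, the labels would first need to be mapped to 1..K if they are not already in that form.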

2. Random Forest Classifier (Random Forest)

Factor = TreeBagger(nTree, train_data, train_label);

[Predict_label,Scores] = predict(Factor, test_data);

Scores is the semantic vector (probability output). In the experiment, nTree = 500.

It works well but is rather slow: 2,500 rows of data took 400 seconds. What about a big-data analysis of 5 million rows? Bring a novel and read it slowly ^_^
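Since training time is the main concern here, it may help to time a small run before committing to the full data. The following is a rough sketch on hypothetical two-class toy data (the data, the split, and the names X, y, idx and trainSeconds are assumptions for illustration). It also converts the predicted labels, which TreeBagger returns as a cell array of character vectors for classification, back to numbers so they can be compared with test_label.

```matlab
% Hypothetical two-class toy data, not the data from the original experiment.
rng(1);
n = 200;                                        % samples per class
X = [randn(n, 5); randn(n, 5) + 1.5];
y = [ones(n, 1); 2 * ones(n, 1)];

% Random 300/100 train/test split.
idx = randperm(2 * n);
train_data  = X(idx(1:300), :);     train_label = y(idx(1:300));
test_data   = X(idx(301:end), :);   test_label  = y(idx(301:end));

nTree = 50;   % smaller than the 500 used in the experiment, so the sketch runs quickly

tic;
Factor = TreeBagger(nTree, train_data, train_label, 'Method', 'classification');
trainSeconds = toc;

% Scores has one column per class, in the order given by Factor.ClassNames.
[Predict_label, Scores] = predict(Factor, test_data);

% Labels come back as a cell array of character vectors; convert to numeric.
Predict_label = str2double(Predict_label);
accuracy = mean(Predict_label == test_label) * 100;

fprintf('%d trees: %.2f s to train, accuracy %.1f%%\n', nTree, trainSeconds, accuracy);
```

Timing a few values of nTree and a few subsample sizes this way gives a rough idea of how the run would scale. If the Parallel Computing Toolbox is available, passing 'Options', statset('UseParallel', true) to TreeBagger is one option worth trying.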

3. Naive Bayes Classifier (Naive Bayes)

Factor = NaiveBayes.fit(train_data, train_label);

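Scores = posterior(Factor, test_data);  % posterior probabilities only

[Scores,Predict_label] = posterior(Factor, test_data);  % posteriors and predicted labels in one call

Predict_label = predict(Factor, test_data);  % predicted labels only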

accuracy = mean(Predict_label == test_label) * 100;  % percent correct; test_label holds the true test labels

Naive Bayes does not work well here.
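NaiveBayes.fit belongs to an older Statistics Toolbox interface. As a hedged alternative, the sketch below swaps in fitcnb, assuming a MATLAB release that provides it. The toy data, the split, and the names X, y and idx are hypothetical and only there to make the example self-contained.

```matlab
% Hypothetical two-class toy data, not the data from the original experiment.
rng(2);
n = 150;                                        % samples per class
X = [randn(n, 4); randn(n, 4) + 1];
y = [ones(n, 1); 2 * ones(n, 1)];

% Random 200/100 train/test split.
idx = randperm(2 * n);
train_data  = X(idx(1:200), :);     train_label = y(idx(1:200));
test_data   = X(idx(201:end), :);   test_label  = y(idx(201:end));

% fitcnb models each feature as Gaussian within each class by default.
Factor = fitcnb(train_data, train_label);

% predict returns the labels and, as a second output, the posterior probabilities.
% Columns of Scores follow the class order in Factor.ClassNames.
[Predict_label, Scores] = predict(Factor, test_data);

% Percentage of test samples classified correctly.
accuracy = mean(Predict_label == test_label) * 100;
fprintf('Naive Bayes accuracy: %.1f%%\n', accuracy);
```

Here a single predict call replaces the separate posterior and predict calls, and the labels come back in the same type as train_label, so they can be compared with test_label directly.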