python - How to use my own algorithm to extract features in scikit-learn (text feature extraction)
I want to use my own algorithm to extract features from the training data, and then fit and transform using CountVectorizer in scikit-learn.
This is what I am currently doing:
from sklearn.feature_extraction.text import CountVectorizer
cvect_obj = CountVectorizer()
vects = cvect_obj.fit_transform(traning_data)
fit_transform(traning_data) automatically extracts the features and transforms the data, but I want to use my own algorithm to extract the features.
This is not really possible directly. As a rule, scikit-learn only adds well-established algorithms. The rule of thumb is at least 3 years since publication, 200+ citations, and wide use and usefulness. A technique that provides a clear-cut improvement (e.g. an enhanced data structure or a more efficient approximation) on a widely-used method will also be considered for inclusion.
Moreover, your implementation doesn't need to be in scikit-learn to be used together with scikit-learn tools. You can implement your favorite algorithm in a scikit-learn compatible way, upload it to GitHub, and have it listed under Related Projects.
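As a rough illustration of what "scikit-learn compatible" can mean here, the sketch below wraps a hand-written feature extractor in a class that exposes fit and transform. The class name SimpleTextStats, the three features it computes, and the sample documents are all hypothetical stand-ins for your own algorithm; only BaseEstimator and TransformerMixin come from scikit-learn.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class SimpleTextStats(BaseEstimator, TransformerMixin):
    """Hypothetical extractor: character count, word count, mean word length."""

    def fit(self, X, y=None):
        # These hand-crafted features need no fitting, so just return self.
        return self

    def transform(self, X):
        rows = []
        for doc in X:
            words = doc.split()
            n_words = len(words)
            # Guard against empty documents to avoid dividing by zero.
            mean_len = sum(len(w) for w in words) / n_words if n_words else 0.0
            rows.append([len(doc), n_words, mean_len])
        return np.array(rows, dtype=float)


# Sample documents, made up for illustration.
docs = ["the quick brown fox", "hello world"]

# fit_transform is supplied by TransformerMixin and calls fit, then transform.
vects = SimpleTextStats().fit_transform(docs)
```

Because the class follows the fit/transform convention, it should be usable wherever scikit-learn expects a transformer, for example as a step in a Pipeline ahead of a classifier.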