sklearn.grid_search.GridSearchCV(estimator, param_grid, scoring=None, fit_params=None, n_jobs=1, iid=True, refit=True, cv=None, verbose=0, pre_dispatch='2*n_jobs', error_score='raise')
Exhaustive search over specified parameter values for an estimator.
Important members are fit, predict.
GridSearchCV implements a “fit” and a “score” method. It also implements “predict”, “predict_proba”, “decision_function”, “transform” and “inverse_transform” if they are implemented in the estimator used.
The parameters of the estimator used to apply these methods are optimized by cross-validated grid-search over a parameter grid.
Read more in the User Guide.
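To make "grid-search over a parameter grid" concrete, the following plain-Python sketch enumerates every combination of a grid. The helper name iter_grid is hypothetical and this is not the library's implementation; it only illustrates which candidate settings an exhaustive search has to evaluate.

```python
from itertools import product


def iter_grid(param_grid):
    """Yield one dict per combination of the values in param_grid.

    Hypothetical helper for illustration only.
    """
    # Sort the parameter names so the enumeration order is deterministic.
    names = sorted(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


# The same grid as in the example on this page: 2 kernels x 2 values of C.
parameters = {'kernel': ('linear', 'rbf'), 'C': [1, 10]}
candidates = list(iter_grid(parameters))
for params in candidates:
    print(params)
```

Each of the four resulting dicts is one candidate setting; the search fits and scores the estimator once per candidate and per cross-validation fold.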
The estimator must either provide a score function, or scoring must be passed.
A scorer is a callable with the signature scorer(estimator, X, y). If None, the score method of the estimator is used.
Number of jobs to run in parallel.
Changed in version 0.17: Upgraded to joblib 0.9.3.
Controls the number of jobs that get dispatched during parallel execution. Reducing this number can be useful to avoid an explosion of memory consumption when more jobs get dispatched than CPUs can process. This parameter can be:
- None, in which case all the jobs are immediately created and spawned. Use this for lightweight and fast-running jobs, to avoid delays due to on-demand spawning of the jobs
- An int, giving the exact number of total jobs that are spawned
- A string, giving an expression as a function of n_jobs, as in ‘2*n_jobs’
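The three accepted forms of pre_dispatch can be illustrated with a small sketch. The function resolve_pre_dispatch is a hypothetical helper written for this page, not part of scikit-learn or joblib; it assumes that a string is an arithmetic expression in which n_jobs is the only available name.

```python
def resolve_pre_dispatch(pre_dispatch, n_jobs):
    """Hypothetical helper: turn a pre_dispatch setting into a job limit.

    Returns None when there is no limit, otherwise an int.
    """
    if pre_dispatch is None:
        # No limit: all jobs are created and spawned immediately.
        return None
    if isinstance(pre_dispatch, str):
        # Evaluate an expression such as '2*n_jobs' with only n_jobs in scope.
        return int(eval(pre_dispatch, {'__builtins__': {}}, {'n_jobs': n_jobs}))
    # An int gives the exact number of total jobs that are spawned.
    return int(pre_dispatch)


limit_default = resolve_pre_dispatch('2*n_jobs', n_jobs=4)
limit_fixed = resolve_pre_dispatch(6, n_jobs=4)
limit_none = resolve_pre_dispatch(None, n_jobs=4)
```

With n_jobs=4, the default expression allows 8 pending jobs at a time, which bounds how many copies of the data exist at once.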
Determines the cross-validation splitting strategy. Possible inputs for cv are:
For integer/None inputs, if the estimator is a classifier and y is either binary or multiclass, StratifiedKFold is used. In all other cases, KFold is used.
Refer to the User Guide for the various cross-validation strategies that can be used here.
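The choice of splitter for integer/None inputs can be written as a short decision rule. This is a sketch of the behaviour as understood here (stratification only for classifiers with a binary or multiclass target), not the library's code; default_splitter and its string return values are hypothetical.

```python
def default_splitter(estimator_is_classifier, target_type):
    """Hypothetical helper: name the splitter used for integer/None cv.

    target_type is assumed to be a label such as 'binary', 'multiclass'
    or 'continuous' describing y.
    """
    if estimator_is_classifier and target_type in ('binary', 'multiclass'):
        # Stratified folds preserve the class proportions in each fold.
        return 'StratifiedKFold'
    # Regressors, and targets that are neither binary nor multiclass.
    return 'KFold'


print(default_splitter(True, 'multiclass'))
print(default_splitter(False, 'continuous'))
```

Passing an explicit cross-validation object as cv bypasses this rule entirely.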
>>> from sklearn import svm, grid_search, datasets
>>> iris = datasets.load_iris()
>>> parameters = {'kernel':('linear', 'rbf'), 'C':[1, 10]}
>>> svr = svm.SVC()
>>> clf = grid_search.GridSearchCV(svr, parameters)
>>> clf.fit(iris.data, iris.target)
...
GridSearchCV(cv=None, error_score=...,
       estimator=SVC(C=1.0, cache_size=..., class_weight=..., coef0=...,
                     decision_function_shape=None, degree=..., gamma=...,
                     kernel='rbf', max_iter=-1, probability=False,
                     random_state=None, shrinking=True, tol=...,
                     verbose=False),
       fit_params={}, iid=..., n_jobs=1,
       param_grid=..., pre_dispatch=..., refit=...,
       scoring=..., verbose=...)
Contains scores for all parameter combinations in param_grid. Each entry corresponds to one parameter setting. Each named tuple has the attributes:
- parameters, a dict of parameter settings
- mean_validation_score, the mean score over the cross-validation folds
- cv_validation_scores, the list of scores for each fold
The parameters selected are those that maximize the score on the left-out data, unless an explicit score is passed, in which case it is used instead.
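The selection rule can be sketched with plain named tuples shaped like the entries described above. The tuple type GridScore and the per-fold scores are made up for illustration; they are not output from a real fit. The sketch also uses an unweighted mean over folds, which is a simplification.

```python
from collections import namedtuple

# Hypothetical stand-in with the three attributes described above.
GridScore = namedtuple(
    'GridScore',
    ['parameters', 'mean_validation_score', 'cv_validation_scores'])


def mean(values):
    return sum(values) / len(values)


# Made-up per-fold scores for four parameter settings (illustration only).
fold_scores = [
    ({'C': 1, 'kernel': 'linear'}, [0.90, 0.95, 1.00]),
    ({'C': 1, 'kernel': 'rbf'}, [0.85, 0.90, 0.95]),
    ({'C': 10, 'kernel': 'linear'}, [0.80, 0.90, 1.00]),
    ({'C': 10, 'kernel': 'rbf'}, [0.70, 0.80, 0.90]),
]
grid_scores = [GridScore(params, mean(scores), scores)
               for params, scores in fold_scores]

# Pick the setting with the highest mean score on the left-out folds.
best = max(grid_scores, key=lambda entry: entry.mean_validation_score)
print(best.parameters)
```

When refit=True, the estimator is then refit on the whole dataset with the selected parameters, which is what the delegated methods such as predict rely on.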
If n_jobs was set to a value higher than one, the data is copied for each point in the grid (and not n_jobs times). This is done for efficiency reasons if individual jobs take very little time, but may raise errors if the dataset is large and not enough memory is available. A workaround in this case is to set pre_dispatch. Then, the memory is copied only pre_dispatch many times. A reasonable value for pre_dispatch is 2 * n_jobs.
ParameterGrid
sklearn.cross_validation.train_test_split()
sklearn.metrics.make_scorer()
__init__(estimator, param_grid, scoring=None, fit_params=None, n_jobs=1, iid=True, refit=True, cv=None, verbose=0, pre_dispatch='2*n_jobs', error_score='raise')
Methods
__init__(estimator, param_grid[, scoring, ...]) |
decision_function(*args, **kwargs) | Call decision_function on the estimator with the best found parameters.
fit(X[, y]) | Run fit with all sets of parameters.
get_params([deep]) | Get parameters for this estimator.
inverse_transform(*args, **kwargs) | Call inverse_transform on the estimator with the best found parameters.
predict(*args, **kwargs) | Call predict on the estimator with the best found parameters.
predict_log_proba(*args, **kwargs) | Call predict_log_proba on the estimator with the best found parameters.
predict_proba(*args, **kwargs) | Call predict_proba on the estimator with the best found parameters.
score(X[, y]) | Returns the score on the given data, if the estimator has been refit.
set_params(**params) | Set the parameters of this estimator.
transform(*args, **kwargs) | Call transform on the estimator with the best found parameters.