Posted on 2023-10-14 16:17:28 | Best answer
1. Using the scikit-learn toolkit (or a self-coded implementation), implement classic logistic regression (no regularization) and logistic regression with L1 and L2 regularization; then adjust the learning parameters and compare their effect on classification accuracy.
Classic logistic regression (no regularization):
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the dataset and keep only the first two features
iris = load_iris()
X = iris.data[:, :2]
y = iris.target

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Build the unregularized model (penalty=None; scikit-learn < 1.2 used penalty='none')
lr = LogisticRegression(penalty=None, solver='lbfgs', max_iter=1000)

# Train the model
lr.fit(X_train, y_train)

# Predict on the test set
y_pred = lr.predict(X_test)

# Compute the accuracy
accuracy = lr.score(X_test, y_test)
print("Accuracy:", accuracy)
Logistic regression with L1 regularization:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the dataset and keep only the first two features
iris = load_iris()
X = iris.data[:, :2]
y = iris.target

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Build the L1-regularized model ('saga' is a solver that supports the l1 penalty)
lr = LogisticRegression(penalty='l1', solver='saga', max_iter=1000)

# Train the model
lr.fit(X_train, y_train)

# Predict on the test set
y_pred = lr.predict(X_test)

# Compute the accuracy
accuracy = lr.score(X_test, y_test)
print("Accuracy:", accuracy)
Logistic regression with L2 regularization:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the dataset and keep only the first two features
iris = load_iris()
X = iris.data[:, :2]
y = iris.target

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Build the L2-regularized model (l2 is also scikit-learn's default penalty)
lr = LogisticRegression(penalty='l2', solver='lbfgs', max_iter=1000)

# Train the model
lr.fit(X_train, y_train)

# Predict on the test set
y_pred = lr.predict(X_test)

# Compute the accuracy
accuracy = lr.score(X_test, y_test)
print("Accuracy:", accuracy)
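The task also allows a self-coded implementation instead of scikit-learn. Below is a minimal NumPy sketch of binary logistic regression trained by plain gradient descent, with an optional L2 penalty controlled by a parameter lam; the names train_logreg, lam, lr and n_iter are illustrative choices, not a standard API. To keep the problem binary it uses only the first two Iris classes. Setting lam=0 gives the unregularized case; an L1 penalty would need a subgradient or proximal step instead of the plain lam * w term.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, lam=0.0, lr=0.1, n_iter=5000):
    # Gradient descent on the cross-entropy loss plus (lam/2) * ||w||^2
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(y)
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)
        grad_w = X.T @ (p - y) / n + lam * w   # lam * w is the L2 term; lam=0 disables it
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Keep only two Iris classes so the problem is binary, and use the first two features
iris = load_iris()
mask = iris.target < 2
X, y = iris.data[mask, :2], iris.target[mask]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

w, b = train_logreg(X_train, y_train, lam=0.01)
pred = (sigmoid(X_test @ w + b) >= 0.5).astype(int)
print("Accuracy:", np.mean(pred == y_test))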
Adjusting the learning parameters and comparing their effect on classification accuracy:
Classification accuracy is mainly affected by the regularization strength C, the penalty type, the solver, and the iteration limit; scikit-learn's LogisticRegression has no explicit learning-rate parameter (a learning rate only appears in a self-coded gradient-descent version). For example, GridSearchCV can search for the best parameter combination (a plain sweep over C is also sketched after the grid-search block):
from sklearn.model_selection import GridSearchCV

# Parameter grid restricted to penalty/solver combinations that scikit-learn supports:
# 'lbfgs' handles l2 or no penalty, 'saga' handles l1, l2, or no penalty
param_grid = [
    {'penalty': ['l1'], 'solver': ['saga'], 'C': [0.001, 0.01, 0.1, 1, 10, 100]},
    {'penalty': ['l2'], 'solver': ['lbfgs', 'saga'], 'C': [0.001, 0.01, 0.1, 1, 10, 100]},
    {'penalty': [None], 'solver': ['lbfgs']},  # use 'none' on scikit-learn < 1.2
]

# Build the base logistic regression model
lr = LogisticRegression(max_iter=1000)

# Search for the best parameters with 5-fold cross-validation
grid_search = GridSearchCV(estimator=lr, param_grid=param_grid, cv=5, n_jobs=-1)
grid_search.fit(X_train, y_train)

# Report the best parameters and the corresponding cross-validation accuracy
print("Best parameters:", grid_search.best_params_)
print("Accuracy:", grid_search.best_score_)
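Besides GridSearchCV, a direct way to see how the regularization strength affects accuracy is to sweep C by hand and print the test accuracy for each penalty. A minimal sketch (reusing X_train, X_test, y_train, y_test from above) might look like this:
from sklearn.linear_model import LogisticRegression

# Smaller C means stronger regularization; compare L1 and L2 at each value
for C in [0.001, 0.01, 0.1, 1, 10, 100]:
    acc_l1 = LogisticRegression(penalty='l1', solver='saga', C=C,
                                max_iter=1000).fit(X_train, y_train).score(X_test, y_test)
    acc_l2 = LogisticRegression(penalty='l2', solver='lbfgs', C=C,
                                max_iter=1000).fit(X_train, y_train).score(X_test, y_test)
    print(f"C={C}: L1 accuracy={acc_l1:.3f}, L2 accuracy={acc_l2:.3f}")
Very small C typically drives the coefficients toward zero (with L1, exactly to zero), which usually lowers accuracy on this two-feature problem, while large C approaches the unregularized fit.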
2. Dataset: the Iris dataset (select two features and split the training and test sets yourself).
Load the dataset and select two features:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the dataset and keep the first two features
iris = load_iris()
X = iris.data[:, :2]
y = iris.target

# Split into training (80%) and test (20%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
To split the training and test sets yourself, you can use the train_test_split function as above, or shuffle and slice the indices manually as in the sketch below.
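One possible manual split (shuffling indices with NumPy and taking an 80/20 split, with a fixed seed for reproducibility):
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data[:, :2], iris.target

# Shuffle the sample indices reproducibly, then take the first 80% for training
rng = np.random.default_rng(42)
idx = rng.permutation(len(y))
split = int(0.8 * len(y))
train_idx, test_idx = idx[:split], idx[split:]

X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]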