Python code snippets used in a probability and statistics course

import …

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
import matplotlib.style as style

Plotting function:

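def plot(x, y, name='', path=''):
    """Plot y against x; optionally set a title and save the figure."""
    style.use('fivethirtyeight')
    plt.figure(dpi=100)
    plt.plot(x, y)
    # Draw the x-axis as a thin dark line at y = 0
    plt.axhline(y=0, color="black", linewidth=1.3, alpha=.7)
    if name:
        plt.title(name)
    if path:
        plt.savefig(path)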

Normal distribution

N=stats.n
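The line above is cut off in the source. As a minimal sketch, assuming it was meant to build a normal distribution with `scipy.stats.norm` (an assumption, not something the text confirms), a frozen standard normal can be used like the other distributions in these notes:

```python
from scipy import stats

# Assumption: a standard normal N(0, 1); change loc/scale for other parameters.
N = stats.norm(loc=0, scale=1)

density_at_zero = N.pdf(0)       # equals 1 / sqrt(2 * pi)
prob_below_196 = N.cdf(1.96)     # P(X <= 1.96)
upper_quantile = N.ppf(0.975)    # inverse of cdf
central_95 = N.interval(0.95)    # central range holding 95% of the mass
print(density_at_zero, prob_below_196, upper_quantile, central_95)
```

Here `loc` is the mean and `scale` is the standard deviation (not the variance).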

$\chi^2$ distribution

Generate

x=np.linspace(0,20,100)
y1=stats.chi2(5).pdf(x)
plot(x,y1,'$χ^2(5)$','x5.jpg')

Confidence interval

c=stats.chi2(144)
c.interval(0.95)
(112.67113138037668, 179.11367821420123)
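One common use of these $\chi^2$ quantiles is a confidence interval for the variance of a normal population. The sketch below is illustrative only: the helper `variance_confidence_interval` and the simulated data are hypothetical, and it assumes the sample really is normal, so that $(n-1)S^2/\sigma^2 \sim \chi^2(n-1)$.

```python
import numpy as np
from scipy import stats

def variance_confidence_interval(sample, confidence=0.95):
    """Two-sided CI for sigma^2 of a normal population (hypothetical helper)."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    s2 = sample.var(ddof=1)  # sample variance S^2
    lower_q, upper_q = stats.chi2(n - 1).interval(confidence)
    # (n-1)S^2/sigma^2 ~ chi2(n-1); dividing by the quantiles swaps their order
    return (n - 1) * s2 / upper_q, (n - 1) * s2 / lower_q

rng = np.random.default_rng(0)                # made-up data, true variance 4
data = rng.normal(loc=10, scale=2, size=145)  # n - 1 = 144 degrees of freedom
low, high = variance_confidence_interval(data)
print(low, high)
```

With 145 observations the degrees of freedom are 144, so the quantiles used inside the helper are the same pair returned by `stats.chi2(144).interval(0.95)`.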

t distribution

Generate

x2=np.linspace(-5,5,100)
y2=stats.t(5).pdf(x2)
plot(x2,y2,'$t(5)$','t5.jpg')
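As a small numerical companion to the plot, the sketch below compares the 0.975 quantile of several t distributions with that of the standard normal. The degrees of freedom chosen here (5, 30, 1000) are arbitrary illustrative values.

```python
from scipy import stats

z = stats.norm.ppf(0.975)  # standard normal quantile, about 1.96
dfs = (5, 30, 1000)        # illustrative degrees of freedom
t_quantiles = [stats.t(df).ppf(0.975) for df in dfs]

for df, q in zip(dfs, t_quantiles):
    # The t quantile exceeds z and shrinks toward it as df grows
    print(df, q, q - z)
```

The heavier tails of the t distribution show up as larger critical values at small degrees of freedom.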

F distribution

x3=np.linspace(0,6,300)
y3=stats.f(3,5).pdf(x3)
plot(x3,y3,'$F(3,5)$','f.jpg')

Important methods:

| Method | Purpose |
| --- | --- |
| pdf(x, df, loc=0, scale=1) | Probability density function. |
| !!! cdf(x, df, loc=0, scale=1) | Cumulative distribution function. |
| ppf(q, df, loc=0, scale=1) | Percent point function (inverse of cdf; gives percentiles). |
| interval(alpha, df, loc=0, scale=1) | Endpoints of the range that contains a fraction alpha (between 0 and 1) of the distribution. |

>>> from scipy import stats
>>> t=stats.t(5)
>>> t.interval(0.95)
(-2.5705818366147395, 2.5705818366147395)
>>> t.cdf(2.571)
0.9750126826580743
>>> t.cdf(2.571)-t.cdf(-2.571)
0.9500253653161486
>>> t.ppf(0.975)
2.5705818366147395
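The session above suggests how `interval`, `ppf`, and `cdf` fit together. A minimal sketch that checks those relationships for `t(5)`, where `alpha` is just a local name for the significance level:

```python
from scipy import stats

t = stats.t(5)
alpha = 0.05  # significance level; the central interval holds 1 - alpha

low, high = t.interval(1 - alpha)
# interval() matches two ppf() calls with alpha / 2 in each tail
assert abs(low - t.ppf(alpha / 2)) < 1e-9
assert abs(high - t.ppf(1 - alpha / 2)) < 1e-9
# cdf() undoes ppf()
assert abs(t.cdf(t.ppf(0.975)) - 0.975) < 1e-9
# Probability mass between the two endpoints
coverage = t.cdf(high) - t.cdf(low)
print(low, high, coverage)
```

Because the t distribution is symmetric about 0, the two endpoints differ only in sign.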

Methods

| Method | Purpose |
| --- | --- |
| rvs(df, loc=0, scale=1, size=1, random_state=None) | Random variates. |
| !!! pdf(x, df, loc=0, scale=1) | Probability density function. |
| logpdf(x, df, loc=0, scale=1) | Log of the probability density function. |
| !!! cdf(x, df, loc=0, scale=1) | Cumulative distribution function. |
| logcdf(x, df, loc=0, scale=1) | Log of the cumulative distribution function. |
| sf(x, df, loc=0, scale=1) | Survival function (also defined as 1 - cdf, but sf is sometimes more accurate). |
| logsf(x, df, loc=0, scale=1) | Log of the survival function. |
| !!! ppf(q, df, loc=0, scale=1) | Percent point function (inverse of cdf; gives percentiles). |
| isf(q, df, loc=0, scale=1) | Inverse survival function (inverse of sf). |
| moment(n, df, loc=0, scale=1) | Non-central moment of order n. |
| stats(df, loc=0, scale=1, moments='mv') | Mean ('m'), variance ('v'), skew ('s'), and/or kurtosis ('k'). |
| entropy(df, loc=0, scale=1) | (Differential) entropy of the RV. |
| fit(data) | Parameter estimates for generic data. See scipy.stats.rv_continuous.fit for detailed documentation of the keyword arguments. |
| expect(func, args=(df,), loc=0, scale=1, lb=None, ub=None, conditional=False, **kwds) | Expected value of a function (of one argument) with respect to the distribution. |
| median(df, loc=0, scale=1) | Median of the distribution. |
| mean(df, loc=0, scale=1) | Mean of the distribution. |
| var(df, loc=0, scale=1) | Variance of the distribution. |
| std(df, loc=0, scale=1) | Standard deviation of the distribution. |
| interval(alpha, df, loc=0, scale=1) | Endpoints of the range that contains a fraction alpha (between 0 and 1) of the distribution. |
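
A few of the less prominent methods from the table, sketched on a frozen `t(5)`. On a frozen distribution the `df` argument is already fixed, so the calls below omit it; the seed and sample size are arbitrary choices for illustration.

```python
from scipy import stats

t = stats.t(5)  # frozen distribution: df is fixed at construction

# For t(df) with df > 2, the mean is 0 and the variance is df / (df - 2)
mean, var = t.stats(moments='mv')
# sf(x) = 1 - cdf(x) gives the upper-tail probability directly
upper_tail = t.sf(2.571)
# isf is the inverse of sf, so isf(0.025) matches ppf(0.975)
critical = t.isf(0.025)
# rvs draws random variates; random_state makes the draw reproducible
draws = t.rvs(size=1000, random_state=0)
print(float(mean), float(var), upper_tail, critical, draws.shape)
```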

numpy

x=np.array([1,2,3])

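| Method | Result | Meaning |
| --- | --- | --- |
| x.mean() | 2.0 | Mean |
| x.var() | 0.6667 | Population variance: np.sum(np.square(x-x.mean()))/len(x) |
| x.var(ddof=1) | 1.0 | Sample variance S^2: np.sum(np.square(x-x.mean()))/(len(x)-1) |
| x.var(ddof=d) | depends on d | np.sum(np.square(x-x.mean()))/(len(x)-d) |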
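
The `ddof` formulas can be checked directly. A minimal sketch using the same `x` as above:

```python
import numpy as np

x = np.array([1, 2, 3])
squared_dev = np.sum(np.square(x - x.mean()))  # sum of squared deviations

population_var = x.var()     # divides by len(x)
sample_var = x.var(ddof=1)   # divides by len(x) - 1
assert np.isclose(population_var, squared_dev / len(x))
assert np.isclose(sample_var, squared_dev / (len(x) - 1))
# std() follows the same ddof convention as var()
assert np.isclose(x.std(ddof=1), np.sqrt(sample_var))
print(population_var, sample_var)
```

NumPy defaults to `ddof=0`, so the unbiased sample variance $S^2$ has to be requested explicitly with `ddof=1`.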