python 进程/线程池
python 进程/线程池

python 进程/线程池

1.线程和进程

进程只是占内存

线程才消耗CPU

2.全局解释器锁GIL

GIL是解释器用于同步线程的一种机制

它使得任何时刻仅有一个线程在执行(就算你是多核处理器也一样)

使用GIL的解释器只允许同一时间执行一个线程

常见的使用GIL解释器有CPythonRuby MRI

如果使用JPython就没有GIL锁

3.多线程测试CPU密集型

import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
#CPU 裸奔
def start():
    """单纯计算  消耗CPU资源  没有实际意义"""
    data = 0
    for _ in range(1000000):
        data += 1
    return

def single_task():
    """单线程运行"""
    # 程序启动时间
    time_data = time.time()
    # 执行10次
    for _ in range(10):
        start()
    # 打印程序消耗时间
    single_consume_time = time.time() - time_data
    print("单线程耗时:", single_consume_time)
    return single_consume_time
def multithread_task():
    """多线程运行"""
    # 程序启动时间
    time_data = time.time()
    with ThreadPoolExecutor(max_workers=10) as pool:
        pool.map(start)
    multithread_consume_time = time.time() - time_data
    print("多线程耗时:", multithread_consume_time)
    return multithread_consume_time
def multiprocess_task():
    """多线程运行"""
    # 程序启动时间
    time_data = time.time()
    with ProcessPoolExecutor(max_workers=10) as pool:
        pool.map(start)
    multiprocess_consume_time = time.time() - time_data
    print("多进程耗时:", multiprocess_consume_time)
    return multiprocess_consume_time
if __name__ == "__main__":
    single_consume_time = single_task()
    multithread_consume_time = multithread_task()
    multiprocess_consume_time = multiprocess_task()
    print(single_consume_time > multithread_consume_time)
    print(single_consume_time > multiprocess_consume_time)
    print(multithread_consume_time > multiprocess_consume_time)

运行结果:

image-20221208153757239

可见耗时排名:单线程耗时>多进程耗时>多线程耗时

视频作者不是线程池的写法,我们参考一下:

import time
import threading
#CPU 裸奔
def start():
    """单纯计算  消耗CPU资源  没有实际意义"""
    data = 0
    for _ in range(1000000):
        data += 1
    return

def single_task():
    """单线程运行"""
    # 程序启动时间
    time_data = time.time()
    # 执行10次
    for _ in range(10):
        start()
    # 打印程序消耗时间
    single_consume_time = time.time() - time_data
    print("单线程耗时:", single_consume_time)
    return single_consume_time
def multithread_task():
    """多线程运行"""
    # 程序启动时间
    time_data = time.time()
    ts = {}
    for i in range(10):
        t = threading.Thread(target=start)
        t.start()
        ts[i] = t
    for i in range(10):
        ts[i].join()
    multithread_consume_time = time.time() - time_data
    print("多线程耗时:", multithread_consume_time)
    return multithread_consume_time
if __name__ == "__main__":
    single_consume_time = single_task()
    multithread_consume_time = multithread_task()

运行结果是两者时间差不多,几次下来有时单线程耗时多,有时多线程耗时多:

image-20221208154419863

线程池的写法为啥这么快,普通多线程的写法为啥这么慢?

以及join()的作用?——堵塞线程

4.什么是堵塞线程

5.非守护与守护线程

6.普通锁与递归锁

7.线程\进程池

from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
with ProcessPoolExecutor(max_workers=20) as pool:
    res = pool.map(run, company_list)

线程/进程池最大数量限制

经检测,多进程在windows端是有数量限制的,最大61个线程

with ProcessPoolExecutor(max_workers=500) as pool:
    res = pool.map(run, company_list)

image-20221208014150294

而在linux端则没有限制,以及在windows端的多进程也没有数量的限制

那么为了实现高并发的情况,并且不清楚任务数量的情况下,还是选择线程池比较理想,兼容性更好

至于上线,并不清楚,几万的并发是可以运行的,主要还是看主机的承受能力,我输入几亿的并发数量,Windows直接卡死了。。。

所以说,还是选择多进程比较方便。

image-20221208014612769

本人又拿公司的电脑测试了,我靠多线程居然又不是61个数量限制了,所以说,多线程还是看机器。

image-20221208015305588

执行顺序

一般来说是随机的,

发表回复

您的电子邮箱地址不会被公开。