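Python Notes: Multithreading Usage Examples

Multithreaded programming is one of the core skills in advanced Python development. It lets a program work on several tasks concurrently and can significantly improve the efficiency of I/O-bound applications. Based on practical code examples, this article covers Python multithreading from the basics to more advanced usage, including caveats and best practices.

1. Core Concepts and Advantages of Multithreading

What is multithreading?

Multithreading resembles running several different programs at the same time. Every thread shares the resources of its process (such as memory and file handles) but has its own CPU register context, including the instruction pointer and the stack pointer.

Five advantages of multithreading

- Background processing: time-consuming tasks (such as processing a large file) can run in the background without blocking the main flow.
- Better user experience: in a GUI program, a task triggered by a button click can show a progress bar while the interface stays responsive.
- Faster execution: on multi-core CPUs threads can run in parallel, but in CPython the GIL prevents pure-Python CPU-bound code from doing so. The speedup mainly applies to I/O-bound work and to C extensions that release the GIL.
- Efficient waiting: while waiting for user input, file reads and writes, or network traffic, a blocked thread gives up the CPU so that other threads can run.
- Lightweight: threads are lighter than processes, with lower creation and switching overhead.

2. Evolution of Python's Threading Modules

| Version | Module | Status |
|---|---|---|
| Python 2 | thread | Deprecated |
| Python 3 | _thread | Low-level compatibility module |
| Python 3 | threading | Recommended |

Note: the thread module was renamed to _thread in Python 3 and exists only for backward compatibility. Production code should prefer the threading module.

3. Basic Examples: Python 2 vs. Python 3

Example 1: the thread module in Python 2

```python
#!/usr/bin/python
# -*- coding: UTF-8 -*-

import thread
import time


def print_time(threadName, delay):
    count = 0
    while count < 5:
        time.sleep(delay)
        count += 1
        print "%s: %s" % (threadName, time.ctime(time.time()))


try:
    thread.start_new_thread(print_time, ("Thread-1", 2,))
    thread.start_new_thread(print_time, ("Thread-2", 4,))
except Exception:
    print "Error: unable to start thread"

while 1:
    time.sleep(1)  # keep the main thread alive without spinning the CPU
```

Example 2: the _thread module in Python 3 (compatibility style)

```python
#!/usr/bin/python3

import _thread
import time


def print_time(threadName, delay):
    count = 0
    while count < 5:
        time.sleep(delay)
        count += 1
        print("%s: %s" % (threadName, time.ctime(time.time())))


try:
    _thread.start_new_thread(print_time, ("Thread-1", 2,))
    _thread.start_new_thread(print_time, ("Thread-2", 4,))
except Exception:
    print("Error: unable to start thread")

while 1:
    time.sleep(1)  # keep the main thread alive without spinning the CPU
```

Key points

- start_new_thread() starts a thread; its arguments are the function and a tuple of that function's arguments.
- The main thread must stay alive (through a while 1 loop or time.sleep()); otherwise the child threads are terminated when the process exits.

4. Recommended Usage: the threading Module

Example 3: subclassing threading.Thread

```python
#!/usr/bin/python3

import threading
import time

exitFlag = 0


class myThread(threading.Thread):
    def __init__(self, threadID, name, counter):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.counter = counter

    def run(self):  # override run()
        print("Starting thread: " + self.name)
        print_time(self.name, self.counter, 5)
        print("Exiting thread: " + self.name)


def print_time(threadName, delay, counter):
    while counter:
        if exitFlag:
            return  # threadName is a str and has no exit(); returning ends the thread
        time.sleep(delay)
        print("%s: %s" % (threadName, time.ctime(time.time())))
        counter -= 1


thread1 = myThread(1, "Thread-1", 1)
thread2 = myThread(2, "Thread-2", 2)

thread1.start()
thread2.start()
thread1.join()  # wait for the thread to finish
thread2.join()
print("Exiting main thread")
```

Core methods

- start(): starts the thread, which then calls run() automatically.
- join(): blocks the calling thread until the child thread has finished.

5. Practical Case: Multithreaded Batch Processing in a Crawler

Example 4: multithreaded data parsing with _thread

```python
from spider.dao.itemLinkDao import *
from spider.dao.categoryPageLinkDao import *
from datetime import datetime
import json
from bs4 import BeautifulSoup
import time
import re
import _thread


def parserawauto(begin, size):
    linkhead = "http://www.525.life/"
    linkend = "/mode_show?token=&user_key=&app_version=2.6.2.1"
    while 1:
        try:
            count = countNoDealedPageRaw()
            if count == 0:
                break
            raws = findNoDealedRawLimit(begin, size)
            for raw in raws:
                if raw["source"] == "食物库app":  # records from the app's JSON API
                    contentjson = json.loads(raw["content"])
                    for food in contentjson["foods"]:
                        link = linkhead + food["code"] + linkend
                        insertItemLink(food["code"], food["name"], raw["link"],
                                       link, raw["type"], raw["source"])
                    dealCategoryPageRaw(raw["link"])
                else:  # records scraped as HTML
                    soup = BeautifulSoup(raw["content"], "html.parser")
                    div = soup.find("div", class_="widget-food-list")
                    ul = div.find("ul", class_="food-list")
                    for box in ul.find_all("div", class_="text-box"):
                        node = box.find("a", href=re.compile(r"/shiwu/\w+"))
                        code = node["href"].replace("/shiwu/", "")
                        name = node["title"]
                        link = linkhead + code + linkend
                        insertItemLink(code, name, raw["link"],
                                       link, raw["type"], raw["source"])
                    dealCategoryPageRaw(raw["link"])
                print("dealed %s %s %s" % (raw["source"], raw["type"], raw["link"]))
        except Exception as e:
            print(e)
    return "begin %s finish %s" % (begin, datetime.now())


def run():
    # Start 20 threads, each handling a different data offset
    for i in range(0, 2000, 100):
        try:
            _thread.start_new_thread(parserawauto, (i, 100))
        except Exception as e:
            print(e)
            print("Error: unable to start thread")


run()
while 1:  # keep the main thread running
    time.sleep(1)
```

Design highlights

- Each thread handles a different offset (begin), so the records are processed concurrently.
- The inner while 1 loop keeps picking up new data until no unprocessed records remain.
- try-except catches exceptions so that a failure in one thread does not bring down the rest.

6. Thread Synchronization: Locks

Example 5: mutual exclusion with threading.Lock

```python
#!/usr/bin/python3

import threading
import time


class myThread(threading.Thread):
    def __init__(self, threadID, name, counter):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.counter = counter

    def run(self):
        print("Starting thread: " + self.name)
        threadLock.acquire()  # acquire the lock
        try:
            print_time(self.name, self.counter, 3)
        finally:
            threadLock.release()  # release the lock even if print_time raises


def print_time(threadName, delay, counter):
    while counter:
        time.sleep(delay)
        print("%s: %s" % (threadName, time.ctime(time.time())))
        counter -= 1


threadLock = threading.Lock()
threads = []

thread1 = myThread(1, "Thread-1", 1)
thread2 = myThread(2, "Thread-2", 2)

thread1.start()
thread2.start()
threads.append(thread1)
threads.append(thread2)

for t in threads:
    t.join()
print("Exiting main thread")
```

Output: whichever thread acquires the lock first (normally Thread-1, because it starts first) finishes all of its output before the other thread begins. The lock guarantees mutual exclusion, not a particular order.

7. Thread Queues

Example 6: managing tasks with queue.Queue

queue.Queue is a thread-safe FIFO queue; when tasks really need priorities, queue.PriorityQueue offers the same interface.

```python
#!/usr/bin/python3

import queue
import threading
import time

exitFlag = 0


class myThread(threading.Thread):
    def __init__(self, threadID, name, q):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.q = q

    def run(self):
        print("Starting thread: " + self.name)
        process_data(self.name, self.q)
        print("Exiting thread: " + self.name)


def process_data(threadName, q):
    while not exitFlag:
        queueLock.acquire()
        if not q.empty():
            data = q.get()
            queueLock.release()
            print("%s processing %s" % (threadName, data))
        else:
            queueLock.release()
        time.sleep(1)


threadList = ["Thread-1", "Thread-2", "Thread-3"]
nameList = ["One", "Two", "Three", "Four", "Five"]
queueLock = threading.Lock()
workQueue = queue.Queue(10)
threads = []
threadID = 1  # was used without being initialized

for tName in threadList:
    thread = myThread(threadID, tName, workQueue)
    thread.start()
    threads.append(thread)
    threadID += 1

# Fill the task queue
queueLock.acquire()
for word in nameList:
    workQueue.put(word)
queueLock.release()

# Wait until the queue has been drained
while not workQueue.empty():
    time.sleep(0.1)

exitFlag = 1  # tell the threads to exit

for t in threads:
    t.join()
print("Exiting main thread")
```

Suitable scenarios

- Producer-consumer models where the amount of work is not known in advance.
- Crawler systems that need to limit the number of concurrent workers.

8. Common Problems and Pitfalls

Problem 1: "Unhandled exception in thread started by"

Cause: the main thread exits while threads started with thread/_thread are still running, so they are killed mid-execution.

Solution: make sure the main thread waits for all child threads. join() is available on threading.Thread objects only; threads started with _thread need the second method.

```python
# Method 1: use join()
thread1.join()
thread2.join()

# Method 2: keep the main thread running
while 1:
    time.sleep(1)
```

Problem 2: the GIL limits CPU-bound tasks

Python's global interpreter lock (GIL) prevents multiple threads from executing CPU-bound Python code in parallel. Use the multiprocessing module for such workloads.

Problem 3: deadlock

A deadlock occurs when several threads each wait for another to release a resource.

Prevention: use threading.RLock (a reentrant lock) when the same thread may acquire a lock more than once, manage locks with the with statement so they are always released, and acquire multiple locks in a consistent order.

```python
lock = threading.Lock()
with lock:  # acquires the lock and releases it automatically
    critical_section()
```

9. Performance Comparison and Selection Guide

| Scenario | Recommended approach | Reason |
|---|---|---|
| I/O-bound work (web crawlers) | threading + queue | Low thread-switching overhead, good concurrency |
| CPU-bound computation | multiprocessing | Bypasses the GIL and uses multiple cores |
| High-concurrency asynchronous tasks | asyncio | Single-threaded coroutines are even lighter |
| Simple background tasks | _thread or threading | Quick to implement |

10. Summary

- Module choice: in Python 3, prefer the threading module.
- Thread safety: protect shared resources with locks to avoid data races.
- Main-thread management: make sure the main thread waits for its child threads; otherwise they may be terminated mid-task or raise errors at exit.
- Task queues: queue.Queue combined with threads is the standard way to build a producer-consumer pipeline.
- Exception handling: each thread should catch its own exceptions so that a failure does not affect other threads.

Multithreading is a powerful tool for improving program efficiency, but it has to fit the workload. For I/O-bound tasks, threads can significantly increase throughput; for CPU-bound tasks, multiple processes are usually the better choice.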
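
The summary points on locking, joining, task queues, and per-thread exception handling can be sketched together in one small program. This is a minimal illustration, not code from the article: the function name run_workers, the None sentinel, and the squaring step that stands in for real I/O-bound work are all assumptions made for the example.

```python
import queue
import threading


def run_workers(items, num_workers=3):
    """Process items on a few threads; return (sorted results, errors).

    Hypothetical helper for illustration. Items must not be None,
    because None is used as the stop signal.
    """
    tasks = queue.Queue()
    results, errors = [], []
    shared_lock = threading.Lock()  # guards the two shared lists

    def worker():
        while True:
            item = tasks.get()      # blocks until a task is available
            if item is None:        # sentinel: no more work for this thread
                break
            try:
                value = item * item  # placeholder for the real work
                with shared_lock:
                    results.append(value)
            except Exception as exc:
                # Catch per task so one bad item does not kill the thread
                with shared_lock:
                    errors.append((item, exc))

    threads = [threading.Thread(target=worker, name="worker-%d" % i)
               for i in range(num_workers)]
    for t in threads:
        t.start()

    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(None)             # one sentinel per worker

    for t in threads:
        t.join()                    # main thread waits for every worker
    return sorted(results), errors


if __name__ == "__main__":
    results, errors = run_workers([1, 2, 3, "x"])
    print(results)       # [1, 4, 9]
    print(len(errors))   # 1 (squaring "x" raises TypeError)
```

Sending one sentinel per worker avoids both the shared exitFlag and the polling loop on empty(): each thread simply blocks in get() until there is work or a stop signal.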