Problem extracting data from a time series, could someone please take a look
I have many time series records like this: "9:05,15793.91,4184206.09,10386.29,4189613.71,26180.20","9:06,17702.01,4182297.99,11081.01,4188918.99,28783.02","9:07,18687.24,4181312.76,12515.68,4187484.32,31202.92","9:08,20866.73,4179133.27,13653.44,4186346.56,34520.17","9:09,21924.33,4178075.67,14737.56,4185262.44,36661.89".....
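For example, I want to pull out the second value (4179133.27) recorded at 9:08. Can this be done with a regular expression?
What is a good way to extract it? Any help appreciated! I am still a Python beginner.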
Last edited by cflying on 2022-12-5 19:39
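With data this regular, can't you just slice the first few characters of the string?
The raw data looks quite regular, so you can probably convert it straight into a list of strings.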
What format is your raw data in, txt?
>>> x = "9:08,20866.73,4179133.27,13653.44,4186346.56,34520.17"
>>> x.split(',')[2]
'4179133.27'
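Building on the split idea, here is a minimal sketch for picking one minute out of many records. It assumes the records are already available as a list of comma-separated strings like the ones in the question; the names `records`, `build_lookup`, and `value_at` are made up for illustration.

```python
# Hypothetical sample: the five records quoted in the question.
records = [
    "9:05,15793.91,4184206.09,10386.29,4189613.71,26180.20",
    "9:06,17702.01,4182297.99,11081.01,4188918.99,28783.02",
    "9:07,18687.24,4181312.76,12515.68,4187484.32,31202.92",
    "9:08,20866.73,4179133.27,13653.44,4186346.56,34520.17",
    "9:09,21924.33,4178075.67,14737.56,4185262.44,36661.89",
]


def build_lookup(records):
    """Map each time string (e.g. '9:08') to its list of float values."""
    lookup = {}
    for record in records:
        # The first field is the time; the remaining fields are the values.
        time, *values = record.split(",")
        lookup[time] = [float(v) for v in values]
    return lookup


def value_at(lookup, time, position):
    """Return the position-th value (1-based) recorded at the given time."""
    return lookup[time][position - 1]


lookup = build_lookup(records)
print(value_at(lookup, "9:08", 2))
```

A dictionary keyed by the time string means each later lookup is a single step, which may be convenient if you need values for many different minutes.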
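You could post the raw data so we can see whether it is in txt or Excel format. A txt file is normally read with open(); for an Excel file there is the xlrd module; and then there is the fairly all-purpose pandas module. Once the data has been read in, the rest is easy to solve.
import re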
s = """9:05,15793.91,4184206.09,10386.29,4189613.71,26180.20","9:06,17702.01,4182297.99,11081.01,4188918.99,28783.02","9:07,18687.24,4181312.76,12515.68,4187484.32,31202.92","9:08,20866.73,4179133.27,13653.44,4186346.56,34520.17","9:09,21924.33,4178075.67,14737.56,4185262.44,36661.89"""
t = re.findall(r"\d:\d\d.+?,(.+?),", s)
t
['4184206.09', '4182297.99', '4181312.76', '4179133.27', '4178075.67']
Every minute has five values; this is using a sledgehammer to crack a nut.
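If you do want a regular expression that returns only the value for one given minute, such as 9:08, something along these lines might work. It is only a sketch, assuming the whole series sits in one string in the quoted format from the question; `find_second_value` is a made-up helper name, and the sample string contains just three of the records.

```python
import re

# Hypothetical sample: three of the records from the question, in one string.
s = ('"9:07,18687.24,4181312.76,12515.68,4187484.32,31202.92",'
     '"9:08,20866.73,4179133.27,13653.44,4186346.56,34520.17",'
     '"9:09,21924.33,4178075.67,14737.56,4185262.44,36661.89"')


def find_second_value(text, time):
    """Return the second value after `time` as a string, or None if absent."""
    # (?<!\d) stops "9:08" from matching inside a longer time such as "19:08".
    # [^,]* skips the first value; the group captures the second one.
    pattern = r'(?<!\d)' + re.escape(time) + r',[^,]*,([^,"]*)'
    match = re.search(pattern, text)
    return match.group(1) if match else None


print(find_second_value(s, "9:08"))
```

The result is a string; wrap it in float() if you need to do arithmetic with it.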