[SOLVED] Count the number of timestamp instances with an interval in Python

Issue

I have a text file with the following timestamps:

0:01

0:02

0:02

0:02

0:03

2:05:52

2:05:52

2:05:52

2:05:53

2:05:53

2:05:53

2:05:53

2:05:54

2:05:54

2:05:54

2:05:54

Currently, I have a dictionary that counts each occurrence of a timestamp, producing output such as [2:05:54, 4]. That works well for ranking the most frequent timestamps. The problem is that without grouping the timestamps into intervals, a single 30-second segment can take up all the top spots: for example, timestamps from 1:00 to 1:30 currently fill the whole ranking. That is why I want to group them into intervals, hopefully with Pandas. From what I can see, Pandas needs the values in MM-DD-YYYY TIMESTAMP form, which I cannot provide.

Solution

First you need to clean your data. I don’t know whether "0:01" means one second or one minute after midnight, and neither does Pandas. Write it as "0:00:01" or "0:01:00" as appropriate. No calendar date is needed, because the values can be parsed as timedeltas rather than datetimes. Then try this:

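import pandas as pd

df = pd.read_table('mydata.txt', header=None)  # one timestamp per line, in column 0
df.index = pd.to_timedelta(df[0])  # convert the strings to timedeltas
df.resample('1 min').count()  # count the rows in each 1-minute bin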

The output is:

           0
0           
00:00:01   5
00:01:01   0
00:02:01   0
00:03:01   0
00:04:01   0
...       ..
02:01:01   0
02:02:01   0
02:03:01   0
02:04:01   0
02:05:01  11
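
The cleaning step mentioned at the start of this answer could be sketched roughly as follows. This is only an illustration: normalize_timestamp is a hypothetical helper name, and it assumes that a two-field value such as "0:01" means minutes:seconds. If your data means hours:minutes instead, the padding has to go on the other side.

```python
def normalize_timestamp(text: str) -> str:
    """Pad a 'M:SS' or 'H:MM:SS' string to the unambiguous form 'H:MM:SS'."""
    parts = text.strip().split(":")
    if len(parts) == 2:
        # ASSUMPTION: two fields mean minutes:seconds, so the hour is 0.
        parts = ["0", *parts]
    if len(parts) != 3:
        raise ValueError(f"unexpected timestamp: {text!r}")
    hours, minutes, seconds = (int(part) for part in parts)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


raw = ["0:01", "0:03", "2:05:52"]
cleaned = [normalize_timestamp(value) for value in raw]
print(cleaned)  # ['0:00:01', '0:00:03', '2:05:52']
```

The cleaned strings can then be passed to pd.to_timedelta as in the snippet above.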

Ref: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.resample.html
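
Since the question is really about ranking the busiest intervals, the long run of empty bins in the resample output may not be what you want. One possible alternative, not the resample approach above, is to round each timedelta down to the start of its minute and count the rounded values. The sketch below assumes the timestamps have already been rewritten as H:MM:SS and uses the sample data from the question inline instead of reading a file.

```python
import pandas as pd

# Sample data from the question, already normalized to H:MM:SS.
stamps = (
    ["0:00:01"] + ["0:00:02"] * 3 + ["0:00:03"]
    + ["2:05:52"] * 3 + ["2:05:53"] * 4 + ["2:05:54"] * 4
)

deltas = pd.to_timedelta(pd.Series(stamps))  # strings -> timedeltas
buckets = deltas.dt.floor("1min")            # round down to the start of each minute

# value_counts() sorts by count, highest first, and lists only
# the minutes that actually occur in the data.
ranked = buckets.value_counts()
print(ranked)
```

Here the bins start on whole minutes, so the busiest one is labeled 02:05:00 rather than 02:05:01. Changing "1min" to another frequency string such as "30s" or "5min" changes the interval width.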

Answered By – John Zwinck

Answer Checked By – Clifford M. (BugsFixing Volunteer)
