Friday, March 29, 2013

Batch rename files in Bash


$ rename s/"SEARCH"/"REPLACE"/g *
This replaces the string SEARCH with REPLACE in every filename (that is, *). The /g means global, so if you had a "SEARCH SEARCH.jpg", it would be renamed "REPLACE REPLACE.jpg". Without /g, the substitution would happen only once, leaving "REPLACE SEARCH.jpg". For case-insensitive matching, add i (that would be /gi or /ig at the end). Note that this is the Perl-based rename found on Debian and Ubuntu; the util-linux rename shipped on some other distributions takes different arguments.
With regular expressions, you can do lots more. For example, to prepend something to every filename:
$ rename s/'^'/'MyPrefix'/ *
That adds MyPrefix to the beginning of every filename. You can also append a suffix:
$ rename s/'$'/'MySuffix'/ *
Also, the -n option will just show what would be renamed, without actually doing it. This is useful because you can make sure your command is right before messing up all your filenames. :)
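If the Perl rename isn't installed, the same global substitution can be done with plain Bash parameter expansion. A minimal sketch (the /tmp directory and filenames here are just for illustration):

```shell
# ${f//SEARCH/REPLACE} substitutes globally, like s/SEARCH/REPLACE/g
mkdir -p /tmp/rename_demo && cd /tmp/rename_demo
touch "SEARCH SEARCH.jpg"
for f in *SEARCH*; do
    mv -- "$f" "${f//SEARCH/REPLACE}"
done
ls   # "SEARCH SEARCH.jpg" is now "REPLACE REPLACE.jpg"
```

Use a single slash, ${f/SEARCH/REPLACE}, for a one-time (non-global) substitution, mirroring the difference between s/// and s///g.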

Thursday, March 28, 2013

How does one write code that best utilizes the CPU cache to improve performance?


The cache is there to reduce the number of times the CPU would stall waiting for a memory request to be fulfilled (avoiding the memory latency), and as a second effect, possibly to reduce the overall amount of data that needs to be transferred (preserving memory bandwidth).
Techniques for avoiding memory fetch latency are typically the first thing to consider, and sometimes help a long way. Limited memory bandwidth is also a limiting factor, particularly for multicore and multithreaded applications where many threads want to use the memory bus. A different set of techniques helps address the latter issue.
Improving spatial locality means ensuring that each cache line is used in full once it has been mapped to the cache. Looking at various standard benchmarks, we have seen that a surprisingly large fraction of them fail to use 100% of the fetched cache lines before the lines are evicted.
Improving cache line utilization helps in three respects:
  • It tends to fit more useful data in the cache, essentially increasing the effective cache size.
  • It tends to fit more useful data in the same cache line, increasing the likelihood that requested data can be found in the cache.
  • It reduces the memory bandwidth requirements, as there will be fewer fetches.
Common techniques are:
  • Use smaller data types
  • Organize your data to avoid alignment holes (sorting your struct members by decreasing size is one way)
  • Beware of the standard dynamic memory allocator, which may introduce holes and spread your data around in memory as it warms up.
  • Make sure all adjacent data is actually used in the hot loops. Otherwise, consider breaking up data structures into hot and cold components, so that the hot loops use hot data.
  • Avoid algorithms and data structures that exhibit irregular access patterns, and favor linear data structures.
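As a quick illustration of the alignment-hole point (a sketch of my own, not taken from the answer above): ctypes mirrors the platform's C struct layout, so the effect of member ordering on padding can be measured directly.

```python
import ctypes

class Padded(ctypes.Structure):
    # A 1-byte char first forces padding before the 8-byte double,
    # and the trailing int leaves tail padding as well.
    _fields_ = [("flag", ctypes.c_char),
                ("value", ctypes.c_double),
                ("count", ctypes.c_int)]

class SortedBySize(ctypes.Structure):
    # Same members, sorted by decreasing size: less padding.
    _fields_ = [("value", ctypes.c_double),
                ("count", ctypes.c_int),
                ("flag", ctypes.c_char)]

print(ctypes.sizeof(Padded), ctypes.sizeof(SortedBySize))
# On a typical 64-bit ABI: 24 vs 16 bytes
```

The sorted layout packs the same data into fewer bytes, so more instances fit in each cache line.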
We should also note that there are other ways to hide memory latency than using caches.
Modern CPUs often have one or more hardware prefetchers. They train on the misses in a cache and try to spot regularities. For instance, after a few misses to subsequent cache lines, the hardware prefetcher will start fetching cache lines into the cache, anticipating the application's needs. If you have a regular access pattern, the hardware prefetcher usually does a very good job. And if your program doesn't display regular access patterns, you may improve things by adding prefetch instructions yourself.
If you regroup instructions so that those that always miss in the cache occur close to each other, the CPU can sometimes overlap these fetches so that the application sustains only one latency hit (memory-level parallelism).
To reduce the overall memory bus pressure, you have to start addressing what is called temporal locality. This means that you have to reuse data while it still hasn't been evicted from the cache.
Merging loops that touch the same data (loop fusion), and rewriting techniques known as tiling or blocking, all strive to avoid those extra memory fetches.
While there are some rules of thumb for this rewrite exercise, you typically have to carefully consider loop carried data dependencies, to ensure that you don't affect the semantics of the program.
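A minimal sketch of loop fusion (my own illustration; in a compiled language over large arrays this is where the cache effect actually shows up):

```python
data = list(range(100_000))

# Unfused: two separate passes, each streaming `data` through the cache
total = sum(data)
squares = [x * x for x in data]

# Fused: one pass computes both results, reusing each element
# while it is still cache-hot
total_f, squares_f = 0, []
for x in data:
    total_f += x
    squares_f.append(x * x)

assert total_f == total and squares_f == squares
```

The fusion is safe here because there is no loop-carried dependency between the two computations; checking for such dependencies is exactly the care mentioned above.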
These techniques are what really pay off in the multicore world, where you typically won't see much throughput improvement after adding the second thread.
+1, this is excellent, thank you. – Antony Vennard Dec 29 '10 at 22:10

I recommend reading the 9-part article What every programmer should know about memory by Ulrich Drepper if you're interested in how memory and software interact. It's also available as a 104-page PDF.
Sections especially relevant to this question might be Part 2 (CPU caches) and Part 5 (What programmers can do - cache optimization).

Sunday, March 10, 2013

Python Time Conversion


From http://emilics.com/blog/article/python_time.html


In Python, there are four types that are commonly used to manage time: timestamps, time tuples, datetime objects, and strings. Programmers often have to convert between these types depending on the situation or API. This article describes all the conversion patterns. The examples below assume both modules have been imported (import datetime and import time). Also note that timestamp values depend on your local timezone; these examples were produced in a UTC-8 timezone.

Python Time Conversion Table

Table 1 shows the Python conversion table; each pattern is covered in its own section below.
Table 1. Python time conversion table

input \ output | datetime                  | time tuple        | time stamp                | string
datetime       | —                         | dt.timetuple()    | mktime(dt.timetuple())    | dt.strftime(fmt)
time tuple     | datetime(*tt[0:6])        | —                 | mktime(tt)                | strftime(fmt, tt)
time stamp     | fromtimestamp(ts)         | localtime(ts)     | —                         | fromtimestamp(ts).strftime(fmt)
string         | strptime(s, fmt)          | strptime(s, fmt)  | mktime(strptime(s, fmt))  | —

datetime → time tuple

>>> dt = datetime.datetime(2010, 12, 31, 23, 59, 59)
>>> tt = dt.timetuple()
>>> print tt
time.struct_time(tm_year=2010, tm_mon=12, tm_mday=31, tm_hour=23, tm_min=59, tm_sec=59, ...)

datetime → time stamp

>>> dt = datetime.datetime(2010, 12, 31, 23, 59, 59)
>>> ts = time.mktime(dt.timetuple())
>>> print ts
1293868799.0

datetime → string

>>> dt = datetime.datetime(2010, 12, 31, 23, 59, 59)
>>> st = dt.strftime('%Y-%m-%d %H:%M:%S')
>>> print st
2010-12-31 23:59:59

time tuple → datetime

>>> tt = (2010, 12, 31, 23, 59, 59, 4, 365, 0)
>>> dt = datetime.datetime(tt[0], tt[1], tt[2], tt[3], tt[4], tt[5])
>>> print dt
2010-12-31 23:59:59
>>>
>>> dt = datetime.datetime(*tt[0:6])  # same as the code above
>>> print dt
2010-12-31 23:59:59

time tuple → time stamp

>>> tt = (2010, 12, 31, 23, 59, 59, 4, 365, 0)
>>> ts = time.mktime(tt)
>>> print ts
1293868799.0

time tuple → string

>>> tt = (2010, 12, 31, 23, 59, 59, 4, 365, 0)
>>> st = time.strftime('%Y-%m-%d %H:%M:%S', tt)
>>> print st
2010-12-31 23:59:59

time stamp → datetime

>>> ts = 1293868799.0
>>> dt = datetime.datetime.fromtimestamp(ts)     # for local time
>>> print dt
2010-12-31 23:59:59
>>>
>>> dt = datetime.datetime.utcfromtimestamp(ts)  # for UTC
>>> print dt
2011-01-01 07:59:59

time stamp → time tuple

>>> ts = 1293868799.0
>>> tt = time.localtime(ts)
>>> print tt
time.struct_time(tm_year=2010, tm_mon=12, tm_mday=31, tm_hour=23, tm_min=59, tm_sec=59, ...)
>>>
>>> tt = time.gmtime(ts)
>>> print tt
time.struct_time(tm_year=2011, tm_mon=1, tm_mday=1, tm_hour=7, tm_min=59, tm_sec=59, ...)

time stamp → string

>>> ts = 1293868799.0
>>> st = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
>>> print st
2010-12-31 23:59:59
>>>
>>> st = datetime.datetime.utcfromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
>>> print st
2011-01-01 07:59:59

string → datetime

>>> s = '2010-12-31 23:59:59'
>>> dt = datetime.datetime.strptime(s, '%Y-%m-%d %H:%M:%S')
>>> print dt
2010-12-31 23:59:59

string → time tuple

>>> st = '2010-12-31 23:59:59'
>>> tt = time.strptime(st, '%Y-%m-%d %H:%M:%S')
>>> print tt
time.struct_time(tm_year=2010, tm_mon=12, tm_mday=31, tm_hour=23, tm_min=59, tm_sec=59, ...)

string → time stamp

>>> s = '2010-12-31 23:59:59'
>>> ts = time.mktime(time.strptime(s, '%Y-%m-%d %H:%M:%S'))
>>> print ts
1293868799.0