Sunday, February 17, 2013

Python parallel processing with a process pool


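from multiprocessing import Pool

# visit() and get_lines() are defined elsewhere; visit must be a top-level function so it can be pickled.
# Pool runs visit() in 5 worker processes (not threads); map() returns results in input order.
with Pool(processes=5) as pool:
    pages = pool.map(visit, get_lines(file))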
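
The snippet above leaves visit and get_lines undefined. A minimal self-contained sketch of the same pattern might look like the following. Both helpers here are hypothetical stand-ins: get_lines is assumed to read non-empty lines from a text file, and visit just upper-cases its input in place of whatever per-line work (for example, fetching a page) the real function does.

```python
from multiprocessing import Pool
import os
import tempfile


def get_lines(path):
    """Return the non-empty lines of a text file, stripped of whitespace."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def visit(line):
    """Hypothetical stand-in for the real per-line work."""
    return line.upper()


if __name__ == "__main__":
    # Write a small input file so the sketch runs on its own.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("alpha\nbeta\ngamma\n")
    try:
        # Five worker processes share the lines; map() keeps input order.
        with Pool(processes=5) as pool:
            pages = pool.map(visit, get_lines(tmp.name))
        print(pages)
    finally:
        os.remove(tmp.name)
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the main module; without it, each worker would try to create its own pool.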

Tuesday, February 12, 2013

List recently modified files under a directory in the Linux shell

To list the most recently modified files, use find to print each file's modification time with stat, sort on that time with the newest first, and keep the top ten with head:

find . -type f -exec stat --format '%Y :%y %n' {} \; | sort -nr | cut -d: -f2- | head

However, this misses files inside directories that are symbolic links, because find does not follow symbolic links by default. In that case, tell find to follow them with -L:

find -L . -type f -exec stat --format '%Y :%y %n' {} \; | sort -nr | cut -d: -f2- | head
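
For comparison, roughly the same listing can be sketched in Python. This is an illustration rather than an exact equivalent of the pipeline: recent_files and its parameters are hypothetical names, and follow_links=True plays the role of -L. Note that following links can loop forever if a symbolic link points back at one of its own parent directories.

```python
import os
import stat


def recent_files(root, follow_links=False, limit=10):
    """Return up to `limit` (mtime, path) pairs under root, newest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=follow_links):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not follow_links and os.path.islink(path):
                continue  # without -L, a symlink does not count as a regular file
            try:
                info = os.stat(path)
            except OSError:
                continue  # broken symlink, or file removed during the walk
            if stat.S_ISREG(info.st_mode):
                found.append((info.st_mtime, path))
    found.sort(reverse=True)
    return found[:limit]


if __name__ == "__main__":
    for mtime, path in recent_files(".", follow_links=True):
        print(int(mtime), path)
```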

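Furthermore, if you just want the files modified in the last few days, that is built into find:

find . -mtime -n

This lists files modified less than n*24 hours ago. Without the minus sign, -mtime n matches only files modified n*24 hours ago (the age is rounded down to whole days), and -mtime +n matches files older than that.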
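
The age test behind -mtime can also be sketched in Python. The function below is a hypothetical helper, not part of any library: it returns files modified less than days*24 hours ago, which is what find expresses as -mtime -n. Unlike plain find, this sketch only reports regular files and skips symbolic links.

```python
import os
import stat
import time


def modified_within(root, days, now=None):
    """Return regular files under root modified less than days*24 hours ago."""
    now = time.time() if now is None else now
    cutoff = now - days * 24 * 60 * 60
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file removed during the walk
            if stat.S_ISREG(info.st_mode) and info.st_mtime > cutoff:
                matches.append(path)
    return matches


if __name__ == "__main__":
    for path in modified_within(".", days=3):
        print(path)
```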

Monday, February 4, 2013

The difference between soft links and hard links in Linux

From: http://linuxgazette.net/105/pitcher.html


Unix files consist of two parts: the data part and the filename part.
The data part is associated with something called an 'inode'. The inode carries the map of where the data is, the file permissions, etc. for the data.
                               .---------------> ! data ! ! data ! etc
                              /                  +------+ !------+
        ! permbits, etc ! data addresses !
        +------------inode---------------+
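
Some of what the inode carries can be inspected from Python with os.stat. The sketch below is only an illustration; inode_summary is a hypothetical helper name, and it shows a few of the fields (inode number, permission bits, link count, size), not the data addresses themselves, which are not exposed this way.

```python
import os
import stat


def inode_summary(path):
    """Return a few inode fields for path as a dictionary."""
    info = os.stat(path)
    return {
        "inode": info.st_ino,                         # inode number
        "permissions": stat.filemode(info.st_mode),   # e.g. '-rw-r--r--'
        "links": info.st_nlink,                       # filenames pointing at this inode
        "size": info.st_size,                         # bytes in the data part
    }


if __name__ == "__main__":
    print(inode_summary(__file__))
```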

The filename part carries a name and an associated inode number.
                         .--------------> ! permbits, etc ! addresses !
                        /                 +---------inode-------------+
        ! filename ! inode # !
        +--------------------+
More than one filename can reference the same inode number; these files are said to be 'hard linked' together.
        ! filename ! inode # !
        +--------------------+
                        \
                         >--------------> ! permbits, etc ! addresses !
                        /                 +---------inode-------------+
        ! othername ! inode # !
        +---------------------+
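
A small Python sketch can confirm the picture above. It assumes a filesystem that supports hard links; hard_link_demo and the file names are hypothetical and chosen to match the diagram. Both names report the same inode number, the link count rises to 2, and the data stays reachable through the second name after the first is removed.

```python
import os
import tempfile


def hard_link_demo(directory):
    """Create two names for one inode; return (same_inode, link_count, data)."""
    filename = os.path.join(directory, "filename")
    othername = os.path.join(directory, "othername")
    with open(filename, "w") as handle:
        handle.write("data")
    os.link(filename, othername)  # second name for the same inode
    same_inode = os.stat(filename).st_ino == os.stat(othername).st_ino
    link_count = os.stat(filename).st_nlink
    os.remove(filename)  # drops one name; the inode and its data remain
    with open(othername) as handle:
        surviving = handle.read()
    return same_inode, link_count, surviving


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(hard_link_demo(tmp))
```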
On the other hand, there's a special file type whose data part carries a path to another file. Since it is a special file, the OS recognizes the data as a path, and redirects opens, reads, and writes so that, instead of accessing the data within the special file, they access the data in the file named by the data in the special file. This special file is called a 'soft link' or a 'symbolic link' (aka a 'symlink').
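
The same behaviour can be observed from Python. The sketch below is illustrative only, with hypothetical names: the symlink stores a path (read back with os.readlink), opening the link reads the target's data, the link has its own inode (seen with os.lstat), and removing the target leaves the link dangling, which does not happen with a hard link.

```python
import os
import tempfile


def symlink_demo(directory):
    """Return (stored_path, data_via_link, different_inode, dangling)."""
    target = os.path.join(directory, "target.txt")
    link = os.path.join(directory, "link.txt")
    with open(target, "w") as handle:
        handle.write("data")
    os.symlink(target, link)  # the link's data part is the path to target
    stored_path = os.readlink(link)
    with open(link) as handle:  # open() is redirected to the target
        via_link = handle.read()
    # lstat inspects the link itself instead of following it
    different_inode = os.lstat(link).st_ino != os.stat(target).st_ino
    os.remove(target)
    dangling = os.path.islink(link) and not os.path.exists(link)
    return stored_path, via_link, different_inode, dangling


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(symlink_demo(tmp))
```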