
python tutorial - Python Traversing Directories Recursively - learn python - python programming



How to get the home directory in Python ?

import os
home = os.path.expanduser("~")
  • os.path.expanduser("~") works on all platforms. Alternatively, we can import the function directly:
from os.path import expanduser
home = expanduser("~")
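The standard pathlib module offers another way to get the same directory. The sketch below is only an illustration, and the variable names are made up for the example; it assumes a normal environment in which the home directory can be resolved.

```python
import os
from pathlib import Path

# Both calls resolve the current user's home directory.
home_from_os = os.path.expanduser("~")
home_from_pathlib = str(Path.home())

print(home_from_os)
print(home_from_pathlib)
```

Path.home() returns a Path object, so it is converted with str() here to compare it with the string returned by os.path.expanduser().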

os.path.basename() vs. os.path.dirname()

  • Both functions use os.path.split(path) to split the pathname path into a pair: (head, tail).
    • path="/foo/bar/item"
    • The os.path.dirname(path) function returns the head of the path.
>>> os.path.dirname(path)
'/foo/bar'
    • The os.path.basename(path) function returns the tail of the path.
>>> os.path.basename(path)
'item'
  • Note that if the path ends with a slash ('/'), we get a different result:
    • path="/foo/bar/item/"
    • The os.path.dirname(path) function returns the head of the path.
>>> os.path.dirname(path)
'/foo/bar/item'
    • The os.path.basename(path) function returns the tail of the path.
>>> os.path.basename(path)
''
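To see both results side by side, here is a small sketch that calls os.path.split() directly. The helper split_ignoring_trailing_slash is hypothetical (it is not part of os.path) and assumes POSIX-style paths with '/' as the separator.

```python
import os

def split_ignoring_trailing_slash(path):
    """Hypothetical helper: drop trailing '/' before splitting into (head, tail)."""
    trimmed = path.rstrip("/") or "/"   # keep the root directory itself intact
    return os.path.split(trimmed)

print(os.path.split("/foo/bar/item"))    # ('/foo/bar', 'item')
print(os.path.split("/foo/bar/item/"))   # ('/foo/bar/item', '')
print(split_ignoring_trailing_slash("/foo/bar/item/"))  # ('/foo/bar', 'item')
```

Stripping the trailing slash first is one possible way to make basename-style lookups return 'item' for both spellings of the path.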

os.walk()

os.walk(top, topdown=True, onerror=None, followlinks=False)
  • os.walk() generates the file names in a directory tree by walking the tree either top-down or bottom-up.
  • For each directory in the tree rooted at directory top, it yields a 3-tuple:
  • (dirpath, dirnames, filenames)
  • The dirpath is a string for the path to the directory. The dirnames is a list of the names of the subdirectories in dirpath (excluding '.' and '..'). The filenames is a list of the names of the non-directory files in dirpath.
  • Note that the names in the lists contain no path components. To get a full path (which begins with top) to a file or directory in dirpath, do os.path.join(dirpath, name).
  • In this section, we're going to use the tree below:
[Figure: the directory tree used in this section (Python Traversing Directories Recursively)]

dirpath:

Sample Code

import os
for dirpath, dirs, files in os.walk("."):
    print(dirpath)

Output:

.
./A
./A/AA
./C
./B
./B/BB

dirs:

Sample Code

import os
for dirpath, dirs, files in os.walk("."):
    print(dirs)

Output:

['A', 'C', 'B']
['AA']
[]
[]
['BB']
[]

files:

Sample Code

import os
for dirpath, dirs, files in os.walk("."):
    print(files)

Output:

['f1']
['f3', 'f2']
['f4', 'f5']
['f9']
['f6']
['f7', 'f8']
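The order in which os.walk() visits sibling directories depends on the file system, which is why 'C' appears before 'B' in the outputs above. The sketch below is a hedged illustration: it rebuilds a tree in a temporary directory using a layout that is only assumed from those outputs, and the helper names build_tree and walk_sorted are made up for the example.

```python
import os
import tempfile

# Assumed layout, reconstructed from the outputs above (the files are empty).
SAMPLE_FILES = ["f1", "A/f2", "A/f3", "A/AA/f4", "A/AA/f5",
                "B/f6", "B/BB/f7", "B/BB/f8", "C/f9"]

def build_tree(root):
    """Create the assumed sample tree under root."""
    for rel in SAMPLE_FILES:
        full = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, "w").close()

def walk_sorted(root, topdown=True):
    """Return the visited directories, relative to root, in os.walk order."""
    visited = []
    for dirpath, dirs, files in os.walk(root, topdown=topdown):
        # Sorting dirs in place fixes the visiting order in top-down mode only;
        # in bottom-up mode the order is already decided when dirs is yielded.
        dirs.sort()
        visited.append(os.path.relpath(dirpath, root))
    return visited

with tempfile.TemporaryDirectory() as root:
    build_tree(root)
    top_down = walk_sorted(root)
    bottom_up = walk_sorted(root, topdown=False)

print(top_down)
print(bottom_up)
```

With topdown=False, every directory is reported after its subdirectories, so the top directory comes last.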

Listing files in directories recursively?

Sample Code

import os
for dirpath, dirs, files in os.walk("."):
    # The depth is the number of path components in dirpath
    path = dirpath.split(os.sep)
    print('|', len(path) * '---', '[', os.path.basename(dirpath), ']')
    for f in files:
        print('|', len(path) * '---', f)
  • If the current directory is TREE, the output looks like this:

Output

| --- [ . ]
| --- f1
| --- f0.py
| ------ [ A ]
| ------ f3
| ------ f2
| --------- [ AA ]
| --------- f4
| --------- f5
| ------ [ C ]
| ------ f9
| ------ [ B ]
| ------ f6
| --------- [ BB ]
| --------- f7
| --------- f8
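The same listing can be produced for any starting directory, not only ".", by computing the depth from the path relative to the top. This is a sketch under that assumption; format_tree is a hypothetical helper, and the small tree it runs on is created only for the demonstration.

```python
import os
import tempfile

def format_tree(top):
    """Return listing lines; the indentation depth comes from the relative path."""
    lines = []
    for dirpath, dirs, files in os.walk(top):
        dirs.sort()  # visit subdirectories in a deterministic order
        rel = os.path.relpath(dirpath, top)
        if rel == os.curdir:
            depth, name = 1, os.curdir
        else:
            depth, name = rel.count(os.sep) + 2, os.path.basename(dirpath)
        lines.append("| " + depth * "---" + " [ " + name + " ]")
        for f in sorted(files):
            lines.append("| " + depth * "---" + " " + f)
    return lines

# Demonstration on a tiny throwaway tree: f1 and A/f2
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "A"))
    for rel in ("f1", os.path.join("A", "f2")):
        open(os.path.join(top, rel), "w").close()
    listing = format_tree(top)

print("\n".join(listing))
```

Returning the lines instead of printing them directly makes the function easier to reuse, for example to write the listing to a file.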

Recursive directory traversing

  • One way to traverse directories recursively is to use os.walk().
  • In this section, we print the contents of every file under a directory using os.walk():
import os
for dirpath, dirs, files in os.walk("./TREE/"):	
	for filename in files:
		fname = os.path.join(dirpath,filename)
		with open(fname) as myfile:
			print(myfile.read())
  • The key is to build each path with os.path.join() before opening the file, because the names in the files list contain no path components.
  • os.path.join(dirpath, filename) gives the full path (which begins with top) to a file in dirpath.

The output from the code:

inside f1
...
inside f8
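In a real tree, some files may be unreadable or may not be valid text. The sketch below is one possible way to handle that, not the only one: iter_file_texts is a hypothetical generator, and UTF-8 with replacement of bad bytes is an assumption about the files being read.

```python
import os
import tempfile

def iter_file_texts(top):
    """Yield (full_path, text) for every file below top."""
    for dirpath, dirs, files in os.walk(top):
        for filename in files:
            fname = os.path.join(dirpath, filename)
            try:
                # errors="replace" keeps undecodable bytes from raising an error
                with open(fname, encoding="utf-8", errors="replace") as myfile:
                    yield fname, myfile.read()
            except OSError as exc:
                # For example, permission denied: report it and move on
                print("skipped", fname, exc)

# Demonstration on a throwaway tree with two small files
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "B", "BB"))
    with open(os.path.join(top, "f1"), "w") as f:
        f.write("inside f1")
    with open(os.path.join(top, "B", "BB", "f8"), "w") as f:
        f.write("inside f8")
    contents = {os.path.basename(p): text for p, text in iter_file_texts(top)}

print(contents)
```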

Recursive directory traversing 2

  • Here is another example. It reads every CMakeLists.txt under a root directory and generates a CSV file listing the targets.

Sample Code

import csv
import os

home = os.path.expanduser("~")
root_dir = os.path.join(home, "TEST/TF/tf")

# Collect the target names declared in every CMakeLists.txt under root_dir
def make_target_list():
   target_tag = "set(TARGET_NAME"
   target_list = []
   for dirpath, dirs, files in os.walk(root_dir):
      if "CMakeLists.txt" in files:
         # Open only CMakeLists.txt, not every file in the directory
         cmake_path = os.path.join(dirpath, "CMakeLists.txt")
         with open(cmake_path) as lines:
            for line in lines:
               if target_tag in line:
                  # The name sits between the tag and the next ")"
                  start = line.find(target_tag) + len(target_tag) + 1
                  target = line[start:line.find(")", start)]
                  target_list.append(target.strip('"'))
   return target_list

# Write all targets as a single space-delimited row
def write_csv(t):
   with open('tf.csv', 'w', newline='') as f:
      w = csv.writer(f, delimiter=' ')
      w.writerow(t)

if __name__ == '__main__':
   target = make_target_list()
   print(target)
   write_csv(target)

Output:

['assignment-client', 'ice-server', 'plugins', 'octree', 'embedded-webserver', 'audio', 'script-engine', 'entities-renderer', 'render-utils', 'model', 'render', 'animation', 'gpu', 'input-plugins', 'networking', 'fbx', 'ui', 'shared', 'avatars', 'display-plugins', 'entities', 'environment', 'audio-client', 'auto-updater', 'physics', 'gpu-test', 'render-utils-test', 'ui-test', 'shaders-test', 'entities-test', 'interface', 'gvr-interface', 'scribe', 'vhacd-util', 'mtc', 'domain-server']

The csv file looks like this:

$ cat tf.csv
assignment-client ice-server plugins octree embedded-webserver audio script-engine entities-renderer render-utils model render animation gpu input-plugins networking fbx ui shared avatars display-plugins entities environment audio-client auto-updater physics gpu-test render-utils-test ui-test shaders-test entities-test interface gvr-interface scribe vhacd-util mtc domain-server
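The target name can also be pulled out with a regular expression instead of string slicing. This is only a sketch: the pattern assumes lines shaped like set(TARGET_NAME name) with optional double quotes, and the sample text is made up for the demonstration rather than taken from a real CMakeLists.txt.

```python
import re

# Assumed line shape: set(TARGET_NAME some-name), optionally with double quotes.
TARGET_RE = re.compile(r'set\(\s*TARGET_NAME\s+"?([^")\s]+)"?\s*\)')

def find_targets(cmake_text):
    """Return every target name found in the given CMake text."""
    return TARGET_RE.findall(cmake_text)

# Hypothetical file content, for illustration only
sample = 'project(demo)\nset(TARGET_NAME "ice-server")\nset(TARGET_NAME octree)\n'
targets = find_targets(sample)
print(targets)
```

Working on a string rather than an open file keeps the parsing step separate from the directory walk, so it can be tried out without touching the file system.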

Recursive directory traversing 3

  • I need to find files that contain more than one Google Ads unit (each page is supposed to have only one unit of this type).
  • So, during the recursive traversal, only *.php files are examined; *.png, *.txt, and other files are skipped.
  • The code counts the occurrences of the ad unit in each file. If a file has more than one unit, it prints two things: the full path and the count.

Sample Code

import os
for dirpath, dirs, files in os.walk("."):
  for filename in files:
    fname = os.path.join(dirpath, filename)
    if fname.endswith('.php'):
      with open(fname) as myfile:
        text = myfile.read()
        c = text.count('bogo_sq')
        if c > 1:
          # Report only files with more than one ad unit
          print(fname, c)

Output

../AngularJS/AngularJS_More_Directives_HTML_DOM.php 2
../DevOps/AWS/aws_S3_Simple_Storage_Service.php 2
../Android/android22Threads.php 2
../python/pytut.php 2
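The same search can be wrapped in a function that returns its findings instead of printing them. The following is a sketch under a few assumptions: files_with_repeats is a hypothetical helper, the files are read as UTF-8 with bad bytes replaced, and the tiny tree is created only for the demonstration.

```python
import os
import tempfile

def files_with_repeats(top, needle, suffix=".php"):
    """Return {path: count} for files ending in suffix that contain needle more than once."""
    hits = {}
    for dirpath, dirs, files in os.walk(top):
        for filename in files:
            if not filename.endswith(suffix):
                continue  # skip *.png, *.txt, and so on
            fname = os.path.join(dirpath, filename)
            with open(fname, encoding="utf-8", errors="replace") as myfile:
                count = myfile.read().count(needle)
            if count > 1:
                hits[fname] = count
    return hits

# Demonstration: only a.php has more than one occurrence in a .php file
with tempfile.TemporaryDirectory() as top:
    samples = {"a.php": "bogo_sq bogo_sq", "b.php": "bogo_sq", "c.txt": "bogo_sq bogo_sq"}
    for name, text in samples.items():
        with open(os.path.join(top, name), "w") as f:
            f.write(text)
    result = {os.path.relpath(p, top): n
              for p, n in files_with_repeats(top, "bogo_sq").items()}

print(result)
```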
