Python, Parsing arguments with the packages argparse and getopt

Introduction

The essential features to pick up when learning Python quickly for immediate use:

This chapter covers how to handle command-line arguments in a Python program.

We want to create the program googleindex.py, which gets the indexing status for a given URL and takes 2 arguments:

python3 googleindex.py --address <url>  [, --jsonauth <path to json auth file> ]

The first argument --address is required, the second one --jsonauth is optional.

Short options must also be available:

python3 googleindex.py -a <url>  [, -j <path to json auth file> ]

The sys.argv array

The built-in list sys.argv stores the command-line arguments: index 0 contains the script name, and the following indices contain the arguments.

import sys

print("Script name : %s" % (sys.argv[0]))
print("First argument : %s" % (sys.argv[1]))
print("All arguments : %s" % (sys.argv[1:]))

python3 googleindex.py test.html google.json
Script name : googleindex.py
First argument : test.html
All arguments : ['test.html', 'google.json']

sys.argv is a module-level list, accessible from anywhere in the program once sys is imported.

An argument can be checked with a try block :

import sys

try:
	arg = sys.argv[2]
except IndexError:
	raise SystemExit("Usage : %s url json" % (sys.argv[0]))

Using sys.argv is easy, but we need the usual syntax :

python3 googleindex.py --address test.html --jsonauth google.json
python3 googleindex.py -a test.html -j google.json

With sys.argv this becomes more complicated, because every string, whether an option name or a value, is a separate element of the list:

Script name : googleindex.py
First argument : --address
All arguments : ['--address', 'test.html', '--jsonauth', 'google.json']
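
To give an idea of the work involved, here is a minimal sketch of hand-rolled parsing on top of a list such as sys.argv[1:]. The function name parse_manually and the error messages are hypothetical, chosen only for this illustration; it is not how googleindex.py is meant to be written.

```python
def parse_manually(argv):
    """Hand-rolled parsing of --address/-a and --jsonauth/-j (illustration only)."""
    values = {"address": None, "jsonauth": None}
    # Map every accepted spelling to the key it fills in
    names = {"--address": "address", "-a": "address",
             "--jsonauth": "jsonauth", "-j": "jsonauth"}
    i = 0
    while i < len(argv):
        key = names.get(argv[i])
        if key is None:
            raise SystemExit("Unknown option : %s" % argv[i])
        if i + 1 >= len(argv):
            raise SystemExit("Missing value for %s" % argv[i])
        values[key] = argv[i + 1]
        i += 2  # skip the option name and its value
    if values["address"] is None:
        raise SystemExit("--address is required")
    return values


# In a real script the list would be sys.argv[1:]
print(parse_manually(["--address", "test.html", "-j", "google.json"]))
```

Even this small version has to deal with unknown options, missing values and required arguments by hand, and it still offers no help message.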

Two existing libraries handle sophisticated command-line interfaces:

  • argparse
  • getopt

No installation is needed: both are part of the Python standard library.

argparse

The package argparse is imported and then an object argparse.ArgumentParser() is instantiated.

import argparse
parser = argparse.ArgumentParser()

The program name is available in the attribute parser.prog, which defaults to the name of the script:

import argparse
parser = argparse.ArgumentParser()
print(parser.prog)

googleindex.py

Arguments are added with the method add_argument, very easy to use :

import argparse

parser = argparse.ArgumentParser()

parser.add_argument("--address", help="URL to be checked", required=True)
parser.add_argument("--jsonauth", help="JSON Google Authentication file path")
parser.add_argument("--verbosity", help="Verbosity", action="store_true")

  • to specify a mandatory argument: required=True
  • to define a boolean flag that takes no value: action="store_true" or action="store_false". With store_true, as in the above example, verbosity is False if --verbosity is not given, otherwise True. store_false does the opposite.
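
The behaviour of the two boolean actions can be checked without a command line by passing an explicit list to parse_args. This is a small sketch: the option --nocache and its dest name cache are hypothetical, added only to show store_false next to store_true.

```python
import argparse

parser = argparse.ArgumentParser(prog="googleindex.py")
parser.add_argument("--address", help="URL to be checked", required=True)
# store_true: False by default, True when the flag is present
parser.add_argument("--verbosity", help="Verbosity", action="store_true")
# store_false: True by default, False when the flag is present (hypothetical option)
parser.add_argument("--nocache", dest="cache", help="Disable cache", action="store_false")

without_flags = parser.parse_args(["--address", "1.html"])
with_flags = parser.parse_args(["--address", "1.html", "--verbosity", "--nocache"])

print(without_flags.verbosity, without_flags.cache)  # False True
print(with_flags.verbosity, with_flags.cache)        # True False
```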

A default value can be defined for optional parameters :

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--address", help="URL to be checked", required=True)
parser.add_argument("--jsonauth",
                      default="/home/sqlpac/google-auth.json",
                      help="JSON Google Authentication file path")
parser.add_argument("--verbosity", help="Verbosity", action="store_true")

Let’s see a first result, running the program without any argument:

python3 googleindex.py
usage: googleindex.py [-h] --address ADDRESS [--jsonauth JSONAUTH] [--verbosity]
googleindex.py: error: the following arguments are required: --address

The option --help is immediately ready to use :

python3 googleindex.py --help
usage: googleindex.py [-h] --address ADDRESS [--jsonauth JSONAUTH] [--verbosity]

optional arguments:
  -h, --help           show this help message and exit
  --address ADDRESS    URL to be checked
  --jsonauth JSONAUTH  JSON Google Authentication file path
  --verbosity          Verbosity

Use the method parse_args to get the argument values. The resulting object is a namespace in which each attribute is an argument:

args = parser.parse_args()
print(args)
print(args.address)
print(args.jsonauth)

python3 googleindex.py --address 1.html --jsonauth google.json
Namespace(address='1.html', jsonauth='google.json', verbosity=False)
1.html
google.json
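
When a dictionary is more convenient than a namespace, the built-in vars() can be applied to the result of parse_args. A short sketch, with the command line simulated by an explicit list and the variable name settings chosen only for this example:

```python
import argparse

parser = argparse.ArgumentParser(prog="googleindex.py")
parser.add_argument("--address", help="URL to be checked", required=True)
parser.add_argument("--jsonauth", help="JSON Google Authentication file path")

# Simulates: python3 googleindex.py --address 1.html --jsonauth google.json
args = parser.parse_args(["--address", "1.html", "--jsonauth", "google.json"])

# vars() exposes the namespace attributes as a dictionary
settings = vars(args)
for name, value in settings.items():
    print("%s = %s" % (name, value))
```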

The attribute name can be changed with dest:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--address", help="URL to be checked", required=True)
parser.add_argument("--jsonauth",
                      default="/home/sqlpac/google-auth.json",
                      dest="jfile",
                      help="JSON Google Authentication file path")
parser.add_argument("--verbosity", help="Verbosity", action="store_true")

args = parser.parse_args()
print(args)
print(args.jfile)

python3 googleindex.py --address 1.html --jsonauth google.json
Namespace(address='1.html', jfile='google.json', verbosity=False)
google.json

By default each value is stored as a string. To force another data type, use type=<datatype> when defining the argument; argparse then converts the value and rejects any input that cannot be converted:

import argparse

parser = argparse.ArgumentParser()
…
parser.add_argument("--year",
                      type=int,
                      default=2020, help="Year extraction")
                                            
…
usage: googleindex.py [-h] --address ADDRESS [--jsonauth JFILE] [--verbosity] [--year YEAR]
googleindex.py: error: argument --year: invalid int value: 'onestring'
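
The type parameter also accepts any callable that takes the raw string and returns the converted value, which allows extra validation. The sketch below assumes a hypothetical helper year_type with arbitrary bounds (2000 to 2100); raising argparse.ArgumentTypeError makes argparse print the message as a regular usage error.

```python
import argparse


def year_type(text):
    """Convert text to an int and accept only years in an arbitrary range."""
    try:
        year = int(text)
    except ValueError:
        raise argparse.ArgumentTypeError("invalid year: %r" % text)
    if not 2000 <= year <= 2100:
        raise argparse.ArgumentTypeError("year out of range: %d" % year)
    return year


parser = argparse.ArgumentParser(prog="googleindex.py")
parser.add_argument("--year", type=year_type, default=2020, help="Year extraction")

# Simulates: python3 googleindex.py --year 2021
args = parser.parse_args(["--year", "2021"])
print(args.year, type(args.year))
```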

To combine short and long options (-a, --address), just use add_argument('short-option','long-option',…)

import argparse

parser = argparse.ArgumentParser()

parser.add_argument("-a","--address", help="URL to be indexed", required=True)
…

Everything is ready in a few lines of code:

import argparse

parser = argparse.ArgumentParser()

parser.add_argument('-a','--address', help='URL to be indexed', required=True)
parser.add_argument("-j","--jsonauth",
					help="JSON Google Authentication file path, default $HOME/google-auth.json",
					default="/home/sqlpac/google-auth.json",
					dest="jfile")
parser.add_argument("-y","--year",
                      type=int,
                      default=2020,
                      help="Year extraction")
parser.add_argument("-v","--verbosity", help="Verbosity", action="store_true")

args = parser.parse_args()

if args.year == 2020:
  print('Current year selected')
…

python3 googleindex.py --address 1.html --jsonauth google.json --year 2020
Current year selected

python3 googleindex.py --help
usage: googleindex.py [-h] -a ADDRESS [-j JFILE] [-y YEAR] [-v]

optional arguments:
  -h, --help            show this help message and exit
  -a ADDRESS, --address ADDRESS
                        URL to be indexed
  -j JFILE, --jsonauth JFILE
                        JSON Google Authentication file path, default $HOME/google-auth.json
  -y YEAR, --year YEAR  Year extraction
  -v, --verbosity       Verbosity

nargs option

Sometimes we need to be able to get multiple values for an argument, for example :

python3 googleindex.py --address url1.html url2.html 

It can be achieved with the option nargs='*' :

import argparse

parser = argparse.ArgumentParser()

parser.add_argument('-a','--address', nargs='*', help='URLs to be indexed', required=True)
…

args = parser.parse_args()

print(args.address)
print(type(args.address))

The argument value is then a list rather than a string:

['url1.html', 'url2.html']
<class 'list'>

A fixed number of values is set with the option nargs=<int value> :

import argparse

parser = argparse.ArgumentParser()

parser.add_argument('-a','--address', nargs=3, help='URLs to be indexed', required=True)
…

args = parser.parse_args()

print(args.address)
print(type(args.address))

When the number of values supplied does not match, an error is raised:

usage: googleindex.py [-h] -a ADDRESS ADDRESS ADDRESS [-j JFILE] [-y YEAR] [-v]
googleindex.py: error: argument -a/--address: expected 3 arguments
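
One detail worth knowing: nargs='*' also accepts zero values, so the option can be given alone and produces an empty list. If at least one value is expected, nargs='+' is the usual alternative. A minimal sketch comparing the two, with hypothetical parser names star and plus and the command line simulated by explicit lists:

```python
import argparse

star = argparse.ArgumentParser(prog="googleindex.py")
star.add_argument("-a", "--address", nargs="*", help="URLs to be indexed", required=True)

plus = argparse.ArgumentParser(prog="googleindex.py")
plus.add_argument("-a", "--address", nargs="+", help="URLs to be indexed", required=True)

# '*' accepts the option without any value and yields an empty list
print(star.parse_args(["-a"]).address)

# '+' yields a list too, but needs at least one value
print(plus.parse_args(["-a", "url1.html", "url2.html"]).address)

# plus.parse_args(["-a"]) would stop the program with a usage error
```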

Parent arguments

Programs often share the same "parent" arguments and only add "child" arguments of their own. To avoid duplicating code, pass the shared parser in the parents argument when creating the child parser.

googleindex.py (parent script)
import argparse

def get_parser(h):
	parser = argparse.ArgumentParser(add_help=h)
	parser.add_argument("-a", "--address", nargs='*', help="URLs to be checked", required=True)
	parser.add_argument("-j", "--jsonauth",
						help="JSON Google Authentication file path, default $HOME/google-auth.json",
						default="/home/sqlpac/google-auth.json")
	
	return parser

if __name__ == "__main__":
	p = get_parser(h=True)
	args = p.parse_args()

googleindexlang.py (child script)
import googleindex
import argparse

def main(p):
	child_parser = argparse.ArgumentParser(parents=[p], add_help=True)
	child_parser.add_argument('-l','--lang', help='Language')
	
	args = child_parser.parse_args()


if __name__ == "__main__":
	p = googleindex.get_parser(h=False)
	main(p)

python3 googleindexlang.py --help
  -h, --help            show this help message and exit
  -a [ADDRESS [ADDRESS ...]], --address [ADDRESS [ADDRESS ...]]
                        URLs to be checked
  -j JSONAUTH, --jsonauth JSONAUTH
                        JSON Google Authentication file path, default $HOME/google-auth.json
  -l LANG, --lang LANG  Language

The parent parser is created with add_help=False when it is used by the child program; otherwise both parsers define -h/--help and a conflict error is raised:

argparse.ArgumentError: argument -h/--help: conflicting option strings: -h, --help
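
The same mechanism can be tried in a single file, which makes the role of add_help easier to see. This is a condensed sketch of the two scripts above, not a replacement for them; the variable names parent and child are illustrative and the command line is simulated by an explicit list.

```python
import argparse

# Parent parser: shared arguments, no -h/--help of its own
parent = argparse.ArgumentParser(add_help=False)
parent.add_argument("-a", "--address", nargs="*", help="URLs to be checked", required=True)

# Child parser: inherits the parent arguments and adds its own
child = argparse.ArgumentParser(prog="googleindexlang.py", parents=[parent])
child.add_argument("-l", "--lang", help="Language")

args = child.parse_args(["-a", "url1.html", "-l", "fr"])
print(args.address, args.lang)

# A parent that keeps its own help option conflicts with the child's one
try:
    argparse.ArgumentParser(parents=[argparse.ArgumentParser()])
except argparse.ArgumentError as err:
    print(err)
```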

getopt

The package getopt is less powerful than argparse and requires more code, including code that reads the list sys.argv directly.

  • Optional and required arguments must be checked manually
  • Data types must be checked also manually

It is still worth knowing how to read and write code that uses this package.

import getopt, sys

def usage():
	print("Usage : %s --address <url> --jsonauth <json auth file path> --year <year selected> --version --help" % (sys.argv[0]))
	sys.exit()

address = False
jsonauth = "/home/sqlpac/google-auth.json"
year = 2020
version = "1.0"

options, args = getopt.getopt(sys.argv[1:], 'a:j:y:hv', ['address=',
														  'jsonauth=',
														  'year=',
														  'help',
														  'version'])
for opt, arg in options:
	if opt in ('-a', '--address'):
		address = arg
	elif opt in ('-j', '--jsonauth'):
		jsonauth = arg
	elif opt in ('-y', '--year'):
		year = int(arg)
	elif opt in ('-v', '--version'):
		print(version)
		sys.exit()
	elif opt in ('-h', '--help'):
		usage()

if not address:
	print('Address required')
	usage()

This piece of code needs little commentary: it is easy to read for beginners once they know that the list sys.argv[1:] holds the arguments.

Anyone used to shell or C programming will recognize the short options string given as the second argument to getopt.getopt, in which a letter followed by a colon (a:, j:, y:) expects a value and a letter alone is a simple flag. The long options are set in the third argument (['address=','jsonauth=','help'…), with = appended when a value is expected.
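
Two things are left to the programmer with getopt: unknown options raise getopt.GetoptError, and values always arrive as strings. The sketch below shows one possible way to handle both. The function name parse and the choice to report problems as ValueError are assumptions made for this example, and only two of the options are kept to stay short.

```python
import getopt


def parse(argv):
    """Return (address, year) from argv; raise ValueError on invalid input."""
    address, year = None, 2020
    try:
        options, _ = getopt.getopt(argv, "a:y:", ["address=", "year="])
    except getopt.GetoptError as err:
        # Unknown option, or option given without its value
        raise ValueError(str(err))
    for opt, arg in options:
        if opt in ("-a", "--address"):
            address = arg
        elif opt in ("-y", "--year"):
            try:
                year = int(arg)  # manual data type check
            except ValueError:
                raise ValueError("year must be an integer: %r" % arg)
    if address is None:
        # Manual check of the required argument
        raise ValueError("address is required")
    return address, year


# In a real script the list would be sys.argv[1:]
print(parse(["--address", "1.html", "-y", "2021"]))
```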

Conclusion

Undoubtedly, the package argparse is more powerful and needs less code than the package getopt. Which one to choose? It depends on personal preference and on the project's complexity.