Find & Replace String In A Non-Ascii File.

Xah Lee, 2004-03-09

Previously we covered find & replace on strings in a file. See http://xahlee.org/perl-python/find_replace.html

However, that code won't work for non-ASCII files, such as text files in Chinese.
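
To see why a byte-level find & replace can miss in such a file, here is a small sketch. It assumes the search string is typed in a utf-8 encoded script while the file is stored as utf-16. The sample title text and all variable names are made up for illustration.

```python
# -*- coding: utf-8 -*-
# Illustration only: the sample text and names are hypothetical.

text = u'<title>西游记</title>'

# the same text as utf-16 bytes, the way it would sit in the file
data = text.encode('utf-16')

# the search string as bytes, the way a utf-8 encoded script would hold it
pattern = u'西游记'.encode('utf-8')

# byte-level search: the utf-8 bytes do not occur in the utf-16 data
found_in_bytes = pattern in data

# decode first, then search as text: now the string is found
found_in_text = u'西游记' in data.decode('utf-16')

print(found_in_bytes)
print(found_in_text)
```

So the fix is to decode the bytes into a unicode string, do the replacement on that, and encode again when writing.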

Here's how to do find & replace for a file encoded in UTF-16.

# -*- coding: utf-8 -*-
# Python

# find and replace many pairs of strings in sequence in a utf-16 file

filePath='/Users/t/web/p/x/x001.html'
outFile=filePath+'~-~'

findreplace = [
(u'<title>西游记</title>', u'<title>西游记 (Monkey King)</title>'),
]

inF = open(filePath,'rb')
s=unicode(inF.read(),'utf-16')
inF.close()

# apply each pair in order; each replacement sees the result of the previous one
for couple in findreplace:
    s=s.replace(couple[0],couple[1])

outF = open(outFile,'wb')
outF.write(s.encode('utf-16'))
outF.close()

Originally, this script was used on an HTML copy of a classic Chinese novel encoded in UTF-16: http://xahlee.org/p/monkey_king/monkey_king.html. (These files have since been changed to UTF-8 encoding.)
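
The same read, replace, write steps can also be sketched with io.open, which does the decoding and encoding itself and runs on Python 2.6+ as well as Python 3. This is only one possible way to write it. The function name replace_in_file and its parameters are hypothetical, not part of the script above.

```python
# -*- coding: utf-8 -*-
import io

def replace_in_file(in_path, out_path, pairs, encoding='utf-16'):
    """Apply each (find, replace) pair in order and write the result to out_path."""
    # newline='' turns off newline translation, so line endings are kept as-is
    with io.open(in_path, 'r', encoding=encoding, newline='') as f:
        s = f.read()

    # each replacement works on the result of the previous one
    for find_str, rep_str in pairs:
        s = s.replace(find_str, rep_str)

    with io.open(out_path, 'w', encoding=encoding, newline='') as f:
        f.write(s)
    return s
```

With the names from the script above, a call would look like replace_in_file(filePath, outFile, findreplace). Passing encoding='utf-8' handles utf-8 files the same way.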

Here's a script that does find & replace on all UTF-8 encoded HTML files in a directory.

# -*- coding: utf-8 -*-
# Python

import os,shutil

mydir= '/Users/t/web/p/xyz'

# use u'' literals so that non-ascii find and replace strings work
findreplace = [
(u'find1',u'replace1'),
(u'find2',u'replace2'),
]

def replaceStringInFile(filePath):
   "applies every (find, replace) pair in findreplace to the utf-8 file filePath"
   print filePath
   tempName=filePath+'~x~'
   backupName=filePath+'~~'

   inF = open(filePath,'rb')
   s=unicode(inF.read(),'utf-8')
   inF.close()

   # apply each pair in order; each replacement sees the result of the previous one
   for couple in findreplace:
       s=s.replace(couple[0],couple[1])
   outF = open(tempName,'wb')
   outF.write(s.encode('utf-8'))
   outF.close()

   # keep a backup of the original, then move the new file into place
   shutil.copy2(filePath,backupName)
   os.remove(filePath)
   os.rename(tempName,filePath)

def myfun(dummy, dirr, filess):
    for child in filess:
        if '.html' == os.path.splitext(child)[1] and os.path.isfile(dirr+'/'+child):
            replaceStringInFile(dirr+'/'+child)
            print child

os.path.walk(mydir, myfun, 'dummy')
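
Note that os.path.walk exists only in Python 2. As a rough sketch, the same directory traversal can be written with os.walk, which works in both Python 2 and Python 3. The generator name html_files is hypothetical.

```python
import os

def html_files(top):
    """Yield the path of every regular file under top whose extension is .html."""
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            if os.path.splitext(name)[1] == '.html':
                path = os.path.join(dirpath, name)
                # skip things like broken symlinks that are not regular files
                if os.path.isfile(path):
                    yield path
```

The last line of the script would then become a loop: for each path in html_files(mydir), call replaceStringInFile(path).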

For a full-featured script that does find-replace in Perl, see: Find & Replace on Multiple Files with Perl


Page created: 2005-03.
© 2005 by Xah Lee.