crawler.reject_files
Do not crawl files with these extensions.
Key: crawler.reject_files
Type: List<String>
Can be set in: collection.cfg
Description
A comma-separated list of file extensions to reject, given without the leading dot (as in the default value). The crawler will not download any file whose URL ends with an extension in this list.
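As a rough illustration, a collection.cfg entry that rejects only a handful of executable and archive types might look like the fragment below. This is a sketch, not a recommended setting: it assumes that a value set in collection.cfg replaces the default list rather than adding to it, which this page does not state. The extensions shown are all taken from the default value.

```
# Hypothetical override: reject only executables and archives.
# Assumption: this value replaces the default list instead of extending it.
crawler.reject_files=exe,dll,zip,tar,gz,tgz
```

Under that assumption, a URL such as https://example.com/files/archive.zip (a made-up address) would not be downloaded, while image URLs ending in jpg or png, which the default list rejects, would no longer be excluded by this setting.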
Default Value
crawler.reject_files=asc,asf,asx,avi,bat,bib,bin,bmp,bz2,c,class,cpp,css,deb,dll,dmg,dvi,exe,fits,fts,gif,gz,h,ico,jar,java,jpeg,jpg,lzh,man,mid,mov,mp3,mp4,mpeg,mpg,o,old,pgp,png,ppm,qt,ra,ram,rpm,svg,swf,tar,tcl,tex,tgz,tif,tiff,vob,wav,wmv,wrl,xpm,zip,Z