@@ -810,6 +810,47 @@ const char *hts_parse_reg(const char *str, int *beg, int *end);
810810    @return      Pointer to the byte after the end of the entire region 
811811                 specifier (including any trailing comma) on success, 
812812                 or NULL if @a str could not be parsed. 
813+ 
814+     A variant of hts_parse_reg which is reference-id aware.  It uses 
815+     the iterator name2id callback to validate the tokenised reference name. 
816+ 
817+     This is necessary due to GRCh38 HLA additions which have reference names 
818+     like "HLA-DRB1*12:17". 
819+ 
820+     To work around ambiguous parsing issues, e.g. when both "chr1" and 
821+     "chr1:100-200" are reference names, quote the name using curly braces. 
822+     Thus "{chr1}:100-200" and "{chr1:100-200}" disambiguate the above example. 
823+ 
824+     Flags are used to control how parsing works, and can be one of the below. 
825+ 
826+     HTS_PARSE_THOUSANDS_SEP: 
827+         Ignore commas in numbers.  For example with this flag 1,234,567 
828+         is interpreted as 1234567. 
829+ 
830+     HTS_PARSE_LIST: 
831+         If present, the region is assmed to be a comma separated list and 
832+         position parsing will not contain commas (this implicitly 
833+         clears HTS_PARSE_THOUSANDS_SEP in the call to hts_parse_decimal). 
834+         On success the return pointer will be the start of the next region, i.e. 
835+         the character after the comma.  (If *ret != '\0' then the caller can 
836+         assume another region is present in the list.) 
837+ 
838+         If not set then positions may contain commas.  In this case the return 
839+         value should point to the end of the string, or NULL on failure. 
840+ 
841+     HTS_PARSE_ONE_COORD: 
842+         If present, X:100 is treated as the single base pair region X:100-100. 
843+         In this case X:-100 is shorthand for X:1-100 and X:100- is X:100-<end>. 
844+         (This is the standard bcftools region convention.) 
845+ 
846+         When not set X:100 is considered to be X:100-<end> where <end> is 
847+         the end of chromosome X (set to INT_MAX here).  X:100- and X:-100 are 
848+         invalid. 
849+         (This is the standard samtools region convention.) 
850+ 
851+     Note the supplied string uses 1-based inclusive coordinates, but the 
852+     returned coordinates are 0-based and half-open, so pos0 is valid 
853+     for use in e.g. "for (pos0 = beg; pos0 < end; pos0++) {...}" 
813854*/ 
814855const char *hts_parse_region(const char *str, int *tid, int64_t *beg, int64_t *end,
815856                             hts_name2id_f getid, void *hdr, int flags);
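
The coordinate rules in the comment above (1-based inclusive input, 0-based half-open output, and the HTS_PARSE_ONE_COORD shorthand) may be easier to follow with a small worked sketch. The function below is a toy written for illustration only, not htslib's implementation: `toy_parse_coords` and its `one_coord` parameter are hypothetical names, it handles only the text after the "name:" prefix, it ignores thousands separators, lists, and brace quoting, and it assumes coordinates below 1 are rejected.

```c
#include <stdint.h>
#include <stdlib.h>
#include <limits.h>

/* Hypothetical helper, NOT part of htslib.  Parses only the coordinate part
 * of a region ("100", "100-200", "-100", "100-") and converts it to a
 * 0-based half-open interval [*beg, *end).
 * one_coord != 0 mimics the documented HTS_PARSE_ONE_COORD behaviour.
 * Returns 0 on success, -1 on failure. */
static int toy_parse_coords(const char *s, int one_coord,
                            int64_t *beg, int64_t *end)
{
    char *p;
    int64_t first = 0, last = 0;
    int have_first = 0, have_dash = 0, have_last = 0;

    if (*s >= '0' && *s <= '9') {          /* leading number, if any */
        first = strtoll(s, &p, 10);
        have_first = 1;
        s = p;
    }
    if (*s == '-') {                       /* optional dash and second number */
        have_dash = 1;
        s++;
        if (*s >= '0' && *s <= '9') {
            last = strtoll(s, &p, 10);
            have_last = 1;
            s = p;
        }
    }
    if (*s != '\0') return -1;                  /* trailing junk */
    if (!have_first && !have_last) return -1;   /* empty string or bare "-" */

    if (have_first && have_last) {
        /* "100-200": both ends given explicitly */
    } else if (have_first && !have_dash) {
        /* "100": single base with one_coord, otherwise open-ended */
        last = one_coord ? first : INT_MAX;
    } else {
        /* "100-" or "-100": only valid with one_coord */
        if (!one_coord) return -1;
        if (have_first) last = INT_MAX;
        else            first = 1;
    }

    if (first < 1 || last < first) return -1;   /* assumption: reject these */

    *beg = first - 1;   /* 1-based inclusive start -> 0-based start */
    *end = last;        /* 1-based inclusive end == 0-based exclusive end */
    return 0;
}
```

With this sketch, "100-200" gives beg = 99 and end = 200 under either convention, while "100" gives end = 100 when `one_coord` is set and end = INT_MAX when it is not, matching the bcftools and samtools conventions described in the comment.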