LeetCode 831. Masking Personal Information (LaTeX)

We are given a personal information string S, which may represent either an email address or a phone number.

We would like to mask this personal information according to the following rules:

  1. Email address:

    We define a name to be a string of length ≥ 2 consisting of only lowercase letters a-z or uppercase letters A-Z.

    An email address starts with a name, followed by the symbol '@', followed by a name, followed by the dot '.' and followed by a name.

    All email addresses are guaranteed to be valid and in the format of "name1@name2.name3".

    To mask an email, all names must be converted to lowercase and all letters between the first and last letter of the first name must be replaced by 5 asterisks '*'.

  2. Phone number:

    A phone number is a string consisting of only the digits 0-9 or the characters from the set {'+', '-', '(', ')', ' '}. You may assume a phone number contains 10 to 13 digits.

    The last 10 digits make up the local number, while the digits before those make up the country code. Note that the country code is optional. We want to expose only the last 4 digits and mask all other digits.

    The local number should be formatted and masked as "***-***-1111", where 1 represents the exposed digits.

    To mask a phone number with country code like "+111 111 111 1111", we write it in the form "+***-***-***-1111". The '+' sign and the first '-' sign before the local number should only exist if there is a country code. For example, a 12 digit phone number mask should start with "+**-".

    Note that extraneous characters like "(", ")", " ", as well as extra dashes or plus signs not part of the above formatting scheme should be removed.

Return the correct “mask” of the information provided.


Example 1:

Input: "LeetCode@LeetCode.com"
Output: "l*****e@leetcode.com"
Explanation: All names are converted to lowercase, and the letters between the
             first and last letter of the first name are replaced by 5 asterisks.
             Therefore, "leetcode" -> "l*****e".

Example 2:

Input: "AB@qq.com"
Output: "a*****b@qq.com"
Explanation: There must be 5 asterisks between the first and last letter 
             of the first name "ab". Therefore, "ab" -> "a*****b".
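
The first-name rule from Examples 1 and 2 can be sketched on its own in expl3. This is a minimal illustration, not the solution used later in this post: the helper \example_mask_name:n and the wrapper \examplemaskname are hypothetical names, and the input is assumed to be lowercased already.

```latex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
% Hypothetical helper (assumption, not from the original solution):
% masks one already-lowercased name, keeping its first and last characters.
\cs_new:Npn \example_mask_name:n #1
  {
    \str_item:nn {#1} { 1 }   % first character
    *****                     % always exactly five asterisks
    \str_item:nn {#1} { -1 }  % last character (negative index counts from the end)
  }
% Hypothetical document-level wrapper.
\newcommand \examplemaskname [1] { \example_mask_name:n {#1} }
\ExplSyntaxOff
\begin{document}
\examplemaskname{leetcode}\par % typesets l*****e
\examplemaskname{ab}           % typesets a*****b
\end{document}
```

Because the five asterisks are a fixed literal, a two-letter name such as "ab" still gets them even though there is nothing between its first and last letter.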

Example 3:

Input: "1(234)567-890"
Output: "***-***-7890"
Explanation: 10 digits in the phone number, which means all digits make up the local number.

Example 4:

Input: "86-(10)12345678"
Output: "+**-***-***-5678"
Explanation: 12 digits, 2 digits for country code and 10 digits for local number. 
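
The relation between the digit count and the shape of the mask can be sketched as follows. This is only an illustration under stated assumptions: \example_phone_mask:n and \examplephonemask are hypothetical names, the argument is the number of country-code digits (total digits minus 10), and XXXX stands in for the four exposed digits.

```latex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
% Hypothetical helper (assumption, not from the original solution):
% #1 is the number of country-code digits, i.e. total digits minus 10.
\cs_new:Npn \example_phone_mask:n #1
  {
    % "+", one asterisk per country-code digit, then "-"; omitted when #1 is 0
    \int_compare:nNnT {#1} > { 0 }
      { + \prg_replicate:nn {#1} { * } - }
    % the local part always has this shape; XXXX stands for the exposed digits
    ***-***-XXXX
  }
% Hypothetical document-level wrapper.
\newcommand \examplephonemask [1] { \example_phone_mask:n {#1} }
\ExplSyntaxOff
\begin{document}
\examplephonemask{0}\par % 10 digits: ***-***-XXXX
\examplephonemask{2}\par % 12 digits: +**-***-***-XXXX
\examplephonemask{3}     % 13 digits: +***-***-***-XXXX
\end{document}
```

\prg_replicate:nn is expandable, so this variant needs no loop counter or scratch token list.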


Notes:

  1. S.length <= 40.
  2. Emails have length at least 8.
  3. Phone numbers have length at least 10.


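% the document class line is missing from the listing; article is an assumption
\documentclass{article}
\usepackage[a3paper, landscape]{geometry}
\usepackage{expl3}
% the code below uses expl3 names (with _ and :), so it must run under \ExplSyntaxOn
\ExplSyntaxOn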



\tl_new:N \g_result_tl
\cs_generate_variant:Nn \regex_extract_once:NnNTF {NVNTF} 
\cs_generate_variant:Nn \regex_extract_all:nnN {nVN} 

\regex_new:N \g_email_regex
% groups: (first letter)(middle letters, lazy)(last letter + "@domain.tld")
\regex_gset:Nn \g_email_regex {([a-zA-Z])([a-zA-Z]*?)([a-zA-Z]{0,1}@[a-zA-Z]+\.[a-zA-Z]+)}
%\par email~regex:~\regex_show:N \g_email_regex

% Mask the email address or phone number in #1; the result is stored in \g_result_tl.
\cs_new_protected:Npn \mask_info:n #1 {
    % normalize the input: trim surrounding spaces and lowercase
    \tl_gset:Nx \g_tmpa_tl {\tl_lower_case:n {\tl_trim_spaces:n {#1}}}
    \par processed~input:~\tl_use:N \g_tmpa_tl

    \regex_extract_once:NVNTF \g_email_regex \g_tmpa_tl \g_tmpa_seq {
        \par email~found!
        \par submatch~stack:~\cs_meaning:N \g_tmpa_seq
        % construct result: item 1 is the whole match, items 2-4 are the capture groups
        \tl_gclear:N \g_tmpa_tl
        \tl_gput_right:Nx \g_tmpa_tl {\seq_item:Nn \g_tmpa_seq {2}}
        \tl_gput_right:Nn \g_tmpa_tl {*****}
        \tl_gput_right:Nx \g_tmpa_tl {\seq_item:Nn \g_tmpa_seq {4}}
    } {
        \par phone~number~found!
        % match all digits
        \regex_extract_all:nVN {[0-9]} \g_tmpa_tl \g_tmpa_seq
        % construct string of digits
        \str_gclear:N \g_tmpa_str
        \seq_map_variable:NNn \g_tmpa_seq \l_tmpa_tl {
            \str_gput_right:NV \g_tmpa_str \l_tmpa_tl
        }
        \par digits:~\str_use:N \g_tmpa_str
        % compute length of country code
        \exp_args:NNx \int_gset:Nn \g_tmpa_int {\str_count:N \g_tmpa_str - 10}
        % construct result: masked local number with the last 4 digits exposed
        \tl_gclear:N \g_tmpa_tl
        \tl_gput_right:Nx \g_tmpa_tl {***-***-\str_range:Nnn \g_tmpa_str {-4} {-1}}
        \int_compare:nNnTF \g_tmpa_int > {0} {
            % if has country code
            % need to construct start tokens first: one asterisk per country-code digit
            \tl_gclear:N \g_tmpb_tl
            \int_gset:Nn \g_tmpb_int {0}
            \int_do_while:nNnn \g_tmpb_int < \g_tmpa_int {
                \tl_gput_right:Nn \g_tmpb_tl {*}
                \int_gincr:N \g_tmpb_int
            }
            \tl_gput_left:Nx \g_tmpa_tl {+ \g_tmpb_tl -}
        } {
            % no country code: nothing to prepend
        }
    }
    \tl_gset_eq:NN \g_result_tl \g_tmpa_tl
}

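% document-level wrapper; its definition line is missing from the listing,
% so the command name \maskinfo is an assumption
\newcommand \maskinfo [1] {
    \mask_info:n {#1}
    \par \textbf{result:~\tl_use:N \g_result_tl}
}
\ExplSyntaxOff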
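
As an alternative to extracting every digit and then concatenating the matches, the digit string could also be obtained by deleting everything that is not a digit. The following is a minimal sketch of that idea, not the method used above; \l_example_digits_tl and \exampledigits are hypothetical names introduced only for this illustration.

```latex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
% Hypothetical scratch variable (assumption, not from the original solution).
\tl_new:N \l_example_digits_tl
% Hypothetical wrapper: typesets the digits of #1 followed by their count.
\newcommand \exampledigits [1]
  {
    \tl_set:Nn \l_example_digits_tl {#1}
    % remove every character that is not a digit
    \regex_replace_all:nnN { [^0-9] } { } \l_example_digits_tl
    \tl_use:N \l_example_digits_tl
    ~ ( \tl_count:N \l_example_digits_tl ~ digits )
  }
\ExplSyntaxOff
\begin{document}
\exampledigits{1(234)567-890}\par % typesets 1234567890 (10 digits)
\exampledigits{86-(10)12345678}   % typesets 861012345678 (12 digits)
\end{document}
```

The count minus 10 gives the length of the country code, which is the quantity the solution above stores in \g_tmpa_int.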






processed input: abcdef@leetcode.com
email found!
submatch stack: macro:->\s__seq \__seq_item:n {abcdef@leetcode.com}\__seq_item:n {a}\__seq_item:n {bcde}\__seq_item:n {f@leetcode.com}
result: a*****f@leetcode.com
processed input: gi@nba.com
email found!
submatch stack: macro:->\s__seq \__seq_item:n {gi@nba.com}\__seq_item:n {g}\__seq_item:n {}\__seq_item:n {i@nba.com}
result: g*****i@nba.com
processed input: 1(234)567-890
phone number found!
digits: 1234567890
result: ***-***-7890
processed input: 86-(10)12345678
phone number found!
digits: 861012345678
result: +**-***-***-5678
2020-06-13 20:43:33-04:00