Efficient Encoding Method for Combined Codes in the MWD Telemetry System



Introduction
Directional drilling is currently the most advanced drilling technology. MWD (Measurement While Drilling) telemetry is a core technology of directional drilling platforms. The MWD system measures and collects geological and drilling engineering parameters near the drill bit during drilling and transmits the collected data to the surface in real time through the telemetry system in order to control the wellbore trajectory. Mud pulse telemetry uses the mud circulating during drilling as the transmission channel and transmits information to the surface by mud pressure pulses. It is currently the most widely used MWD telemetry technology [1][2][3][4][5].
MPPM (Multipulse Position Modulation) [6][7][8][9] is a pulse coding technology. Because it can send more than ten bits of data with only a few pulses, MPPM is highly energy efficient and is widely used in fields such as fiber-optic communication and free-space communication [10][11][12][13][14][15][16][17]. The principle of MPPM is to divide the transmission time of a code symbol into N time slots and to send pulses in only M of them. The bit symbol being sent (data with a bit width of K bits) is encoded in the positions of these M time slots. MPPM coding can also be combined with PSK (Phase Shift Keying) to obtain higher transmission rates through multi-mode modulation [16,17].
CC (Combination Code) is a special type of MPPM that is widely used in the mud pulse telemetry systems of MWD, especially for the transmission of telemetry data in ultra-deep wells, where energy efficiency is very important. High energy efficiency reduces the power consumption of downhole units and the wear of pulse generators [18][19][20]. Baker optimized the code symbols of conventional CC codes to obtain AC (Advanced Combination) codes [19,20]. Compared with CC codes, AC codes have increased redundancy and therefore better decoding performance. This has been studied theoretically in [21].
At the downhole transmitter, bit symbols need to be mapped to corresponding code symbols before they can be sent through the pulse generator. However, it is difficult to establish a simple mapping relationship between bit symbols and CC code symbols, which makes CC coding difficult to implement. Downhole transmitters usually generate encoding tables in advance, and the mapping from bit symbols to code symbols is implemented by table lookup. However, the space complexity of the encoding table grows exponentially as K increases. In MWD telemetry systems, the downhole working environment is harsh and the processor structure is relatively simple, making it difficult to cope with the space complexity when K is as high as 16 bits. Tenkasi [22] proposed a method based on arithmetic code decompression to construct equiweight codes. Equiweight codes are channel codes with error detection capability in which the number of bits equal to 1 in each code group remains constant. Therefore, equiweight codes can be used to construct MPPM symbols. Arithmetic coding is an entropy coding method whose compression and decompression algorithms both have low complexity. Therefore, Tenkasi's method can achieve low-complexity mapping of MPPM bit symbols. The time complexity of Tenkasi's method is $O(N^2)$ [22]. Qin et al. [23] proposed an MPPM bit symbol mapping method for the case of M = 2. This method uses a lower triangular matrix with a certain regularity to map MPPM bit symbols to pulse positions. Each element in the matrix corresponds to a K-bit symbol, so simple mathematical calculations can be used to map K-bit binary symbols to the time slot positions of their corresponding two pulses. This method overcomes the exponential growth in the space complexity of encoding tables when the bit width is large, but it is only applicable to MPPM modulation with M = 2. Lin et al. [24] proposed an MPPM modulation and demodulation scheme based on an autoencoder model. The mapping from code symbols to pulse positions is achieved by training the autoencoder. The space complexity of the trained model is lower than that of the encoding table. However, the autoencoder model is only applicable to MPPM symbol modulation with specific (N, M) combinations. If the number of pulses or time slots changes, the model no longer meets the requirements of downhole combination code modulation and needs to be retrained.
Considering that MPPM (N, M) includes $C_N^M$ pulse combinations, which can represent $C_N^M$ different information symbols, this paper maps the integers 0 to $C_N^M - 1$ to a hyper-triangular region of a high-dimensional tensor and establishes a small lookup table that gives a simple computational relationship between each integer and its corresponding tensor subscripts. The subscripts corresponding to an integer are the pulse positions of the MPPM symbol corresponding to that integer. This achieves low-complexity mapping from bit symbols to MPPM symbols. The MPPM encoding can be easily extended to the CC codes commonly used in MWD telemetry systems, which yields the MPCC code. The MPCC code can significantly reduce the complexity of the CC encoding end while maintaining the same error performance.
The organization of this paper is as follows. Section 2 first gives the mathematical definition of MPPM (N, M) code symbols. Then, a regular mapping relationship between bit symbols and MPPM code symbols is established to obtain an efficient MPPM coding method. Section 3 first gives the mathematical definition of CC codes and points out the relationship between MPPM and CC codes, as well as how to generate CC code symbols from MPPM code symbols. Then, the characteristics of AC codes and their relationship with CC codes are analyzed. In Section 4, a new MPCC (Multipulse Position Checksum Code) is designed. The encoding complexity of MPCC codes and AC codes is compared and analyzed. Finally, it is verified through simulation that the error performance of MPCC codes is consistent with that of AC codes. Section 5 summarizes the paper.

Efficient Encoding Method of MPPM
For convenience of description, we first give the mathematical definition of a time slot.
Definition 1. A time slot is a time interval of length $\tau_S$. The time slot numbered i is denoted as
$$s_i = [\,t_0 + (i-1)\tau_S,\ t_0 + i\tau_S\,],$$
where i is the time slot number, $t_0 + (i-1)\tau_S$ is the start boundary of time slot i, and $t_0 + i\tau_S$ is the end boundary of time slot i. N consecutive time slots form a coding interval for pulse position modulation, $s^N = [s_1, s_2, \ldots, s_N]$. The set $Z_N = \{1, 2, \ldots, N\}$ is the set of time slot numbers on the coding interval $s^N$.
In this paper, a subset of $Z_N$ denotes the time slots in which pulses are transmitted within a code element symbol of pulse position modulation (coding).

Definition of MPPM Code Element Symbols
MPPM (N, M) encodes bit symbols by using the positions of M pulse time slots selected arbitrarily from the N time slots of $s^N$. The code symbols are defined as follows.
Definition 2. A pulse position sequence $q = [q_1, q_2, \ldots, q_M]$ is a code element symbol of MPPM (N, M) if it represents the time slot numbers of the transmitted pulses, i.e., $q_i \in Z_N$, $i = 1, 2, \ldots, M$, and satisfies $q_1 < q_2 < \ldots < q_M$. The set $Q = \{q\}$ consisting of all such pulse position sequences on the coding interval $s^N$ is called the set of code element symbols of MPPM (N, M).
It follows directly that the set of code element symbols of MPPM (N, M) contains $C_N^M$ different symbols and can thus encode bit symbols of bit width $K = \lfloor \log_2 C_N^M \rfloor$, where $\lfloor x \rfloor$ denotes rounding x down. MPPM (N, M) encoding maps a K-bit symbol B to a symbol in the set of MPPM (N, M) code element symbols; this is the bit-symbol mapping. Conventional bit-symbol mapping uses combinatorics to enumerate all pulse position sequences that choose M of the N time slots, builds an encoding table from them, and then implements the bit-symbol mapping by table lookup. The storage space required by this encoding table is large, especially when the bit width K of the bit symbols is large. The reason is the lack of a regular mapping relationship between bit symbols and code element symbols, which prevents the pulse position sequence from being obtained by simple computation on the bit symbol.
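The bit width formula and the cost of the conventional table can be illustrated with a minimal Python sketch. The function names `mppm_bit_width` and `conventional_encoding_table` are hypothetical and chosen only for this illustration; the table simply keeps the first $2^K$ combinations in lexicographic order, which is one possible conventional ordering and not necessarily the one used in practice.

```python
from itertools import combinations, islice
from math import comb


def mppm_bit_width(n_slots: int, n_pulses: int) -> int:
    """Return K = floor(log2(C(N, M))) for MPPM(N, M)."""
    count = comb(n_slots, n_pulses)
    # For a positive integer x, x.bit_length() - 1 equals floor(log2(x)) exactly.
    return count.bit_length() - 1


def conventional_encoding_table(n_slots: int, n_pulses: int) -> list:
    """Enumerate pulse position sequences and keep the first 2**K as a table.

    The table has 2**K entries of M positions each, which is the storage
    cost that grows exponentially with the bit width K.
    """
    k = mppm_bit_width(n_slots, n_pulses)
    all_symbols = combinations(range(1, n_slots + 1), n_pulses)
    return list(islice(all_symbols, 2 ** k))
```

For MPPM (5, 2) the table has only 8 entries, whereas for MPPM (24, 6) the same construction already needs $2^{17}$ entries.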

Efficient Encoding Method
Therefore, we consider mapping each integer in the range 0 to $C_N^M - 1$ to an element of an M-dimensional tensor T and using the index number of the tensor element in each dimension as the pulse positions of the code element symbol mapped from the integer.

M = 2
When M = 2, the tensor is a two-dimensional matrix, and we map the integers to the lower triangular region of the matrix (excluding the diagonal elements). For example, for MPPM (5, 2), we map the integers 0 to 9 to the two-dimensional tensor $T_{5\times5}$ as in Figure 1.
The first dimension is denoted as $R_1$, and the second dimension is denoted as $R_2$. Since the diagonal is not included, the index numbers of the two dimensions corresponding to each integer satisfy $R_1 < R_2$, which satisfies the encoding condition of MPPM. Also, the total number of pulse combinations of MPPM (5, 2) is 10, which maps exactly one-to-one onto the elements of the lower triangular part of $T_{5\times5}$. Therefore, the index pair $(R_1, R_2)$ of an integer can be used as the corresponding MPPM code element symbol, i.e., $q = [R_1, R_2]$.
Assume that the K-bit message has the integer value D. The pulse positions $(R_1, R_2)$ of the corresponding code symbol can be easily calculated from D. Assume that D lies in row $R_2$ of the second dimension. Since rows 2 to $R_2 - 1$ hold $1 + 2 + \cdots + (R_2 - 2)$ integers, the first integer in row $R_2$ is $(R_2-1)(R_2-2)/2$, and $R_2$ is the row that satisfies
$$\frac{(R_2-1)(R_2-2)}{2} \le D < \frac{R_2(R_2-1)}{2}.$$
Then, we can obtain
$$R_1 = D - \frac{(R_2-1)(R_2-2)}{2} + 1.$$
Therefore, when we encode, we only need to establish a lookup table of the first integer value of each row of the second dimension. We can then obtain $R_2$ from the data to be encoded D, and calculate $R_1$ from $R_2$ to obtain the encoding result of MPPM (N, 2).
Theorem 1. For MPPM (N, 2), construct the matrix $T_{N\times N}$ by mapping the sequence of integers 0, 1, 2, … to the elements of the matrix in row-first order, starting at row 2 of the matrix, with the nth row holding n − 1 integers. The total number of elements in the lower triangular region of $T_{N\times N}$ thus obtained is equal to the total number of pulse combinations of MPPM (N, 2).
According to Theorem 1, the sequence of integers 0 to $C_N^2 - 1$ maps one-to-one onto the elements of the lower triangular region of $T_{N\times N}$.

Proof of Theorem 1. The number of elements of $T_{N\times N}$ to which integers are mapped is
$$\sum_{n=2}^{N}(n-1) = \frac{N(N-1)}{2}.$$
The number of pulse combinations for MPPM (N, 2) is
$$C_N^2 = \frac{N(N-1)}{2}.$$
The two are equal, so the conclusion holds. □
Theorem 1 shows that the sequence of integers 0, 1, 2, … can be mapped sequentially, one by one, to the elements of the lower triangular region of $T_{N\times N}$. The row and column numbers of such an element exactly represent a pulse position sequence of MPPM (N, 2). By taking the integers 0 to $2^K - 1$, the mapping from K-bit symbols to MPPM (N, 2) code element symbols can be realized.
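The M = 2 case can be sketched in a few lines of Python. This is only an illustration under the row-first mapping of Theorem 1; the function name `encode_mppm_2` and the list `first_in_row` are hypothetical names introduced here, and the per-row table is rebuilt on every call for brevity (a downhole implementation would store it once).

```python
from bisect import bisect_right


def encode_mppm_2(d: int, n_slots: int) -> tuple:
    """Map an integer d to the pulse positions (R1, R2) of MPPM(N, 2)."""
    total = n_slots * (n_slots - 1) // 2  # C(N, 2)
    if not 0 <= d < total:
        raise ValueError("d must lie in 0 .. C(N, 2) - 1")
    # first_in_row[j] is the first integer mapped in row n = j + 2,
    # i.e. (n - 1)(n - 2) / 2, since row n holds n - 1 integers.
    first_in_row = [(n - 1) * (n - 2) // 2 for n in range(2, n_slots + 1)]
    # The row containing d is the last row whose first integer is <= d.
    r2 = bisect_right(first_in_row, d) + 1
    r1 = d - first_in_row[r2 - 2] + 1
    return r1, r2
```

For MPPM (5, 2), the integers 0 to 9 are mapped to ten distinct pairs with $R_1 < R_2$, starting at (1, 2) and ending at (4, 5).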

M > 2
For MPPM (N, M) with M > 2, we can construct an M-dimensional tensor $T_{N\times N\times\cdots\times N}$ (N repeated M times), whose indexes in each dimension are denoted as $R_1, R_2, \ldots, R_M$, respectively, where $R_i = 1, 2, \ldots, N$. The integers 0 to $C_N^M - 1$ are mapped sequentially to the hyper-triangular region of T. The 1st dimension is mapped starting from index 1, the 2nd dimension from index 2, …, and the Mth dimension from index M. That is, the integer 0 is mapped to T(1, 2, . . ., M). The indexes of the elements of the mapped hyper-triangular region satisfy $R_1 < R_2 < \ldots < R_M$. Such a mapping ensures that the indexes in each dimension corresponding to a mapped integer satisfy the definition of an MPPM (N, M) code element symbol and can therefore represent an MPPM (N, M) pulse combination. For example, the hyper-triangular region of the MPPM (5, 3) mapping is shown in Figure 2.
Based on the mapping of MPPM (5, 2) and MPPM (5, 3) to the triangular regions of the two- and three-dimensional tensors T, the following rule can be summarized:
1. As can be seen from Figure 1, the two-dimensional triangular region of a two-dimensional tensor can be viewed as a cascade of multiple one-dimensional vectors. In the second dimension, the numbers of elements of the one-dimensional vectors mapped by indexes 2 to N are, in order: 1, 2, 3, 4, …
2. From Figure 2, it can be seen that the three-dimensional triangular region of a three-dimensional tensor can be viewed as a cascade of multiple two-dimensional matrices. In the third dimension, the numbers of elements of the two-dimensional triangular regions mapped by indexes 3 to N are, in order: 1, 3, 6, …
3. By analogy, the m-dimensional tensor is viewed as a cascade of (m − 1)-dimensional tensors. The number of elements of the (m − 1)-dimensional hyper-triangular region mapped by index i = m, m + 1, . . ., N in the mth dimension is $a_i^{(m)} = C_{i-1}^{m-1}$, i.e., in order: $C_{m-1}^{m-1}, C_{m}^{m-1}, \ldots, C_{N-1}^{m-1}$.
The above rule can be strictly proved. To do this, we first prove a theorem.
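Theorem 2. For integers $1 \le m \le n$,
$$\sum_{i=m}^{n} C_i^m = C_{n+1}^{m+1}.$$
Proof of Theorem 2. From the identity $C_{i+1}^{m+1} = C_i^{m+1} + C_i^m$, we have $C_i^m = C_{i+1}^{m+1} - C_i^{m+1}$. Summing over i = m, m + 1, . . ., n, the terms cancel in pairs and leave $C_{n+1}^{m+1} - C_m^{m+1} = C_{n+1}^{m+1}$, since $C_m^{m+1} = 0$. □
Proof of Theorem 3. Let $a_n^{(m)}$ denote the number of integers mapped by the nth layer in dimension m. We prove $a_n^{(m)} = C_{n-1}^{m-1}$, n = m, m + 1, . . ., N, by mathematical induction.
1. On dimension two: integers are mapped onto the elements of the lower triangular region of the two-dimensional tensor. The mapping starts from $R_2 = 2$, so on dimension two, the number of integers mapped in the nth row is n − 1, n = 2, 3, . . ., N:
$$a_n^{(2)} = n - 1 = C_{n-1}^{1}.$$
2. On dimension three: the three-dimensional tensor can be viewed as two-dimensional tensors layered along dimension three. The mapping on dimension three starts from $R_3 = 3$, and therefore the number of integers mapped on the nth layer, n = 3, 4, . . ., N, is:
$$a_n^{(3)} = \sum_{i=2}^{n-1} a_i^{(2)} = \sum_{i=2}^{n-1} (i-1) = \frac{(n-1)(n-2)}{2} = C_{n-1}^{2}.$$
3. Assume that the conclusion holds in dimension m with the mapping starting from $R_m = m$, i.e., $a_n^{(m)} = C_{n-1}^{m-1}$. Then, the mapping starting from $R_{m+1} = m + 1$ on dimension m + 1 has
$$a_n^{(m+1)} = \sum_{i=m}^{n-1} a_i^{(m)} = \sum_{i=m}^{n-1} C_{i-1}^{m-1}.$$
Let j = i − 1. By Theorem 2,
$$a_n^{(m+1)} = \sum_{j=m-1}^{n-2} C_j^{m-1} = C_{n-1}^{m},$$
i.e., $a_n^{(m+1)} = C_{n-1}^{(m+1)-1}$. Therefore, the conclusion also holds in dimension m + 1. □
Theorem 4. When the integers are mapped sequentially to the hyper-triangular region of the M-dimensional tensor T, the number of elements of the mapped hyper-triangular region is equal to the total number of pulse combinations of MPPM (N, M), namely $C_N^M$.
Proof of Theorem 4. The total number of integers mapped in the M-dimensional hyper-triangular region is:
$$\sum_{n=M}^{N} a_n^{(M)} = \sum_{n=M}^{N} C_{n-1}^{M-1}.$$
Let j = n − 1. According to the conclusions of Theorems 2 and 3, it is then obtained that:
$$\sum_{j=M-1}^{N-1} C_j^{M-1} = C_N^M.$$
□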
By Theorem 4, it follows that the code element symbols of MPPM (N, M) map one-to-one onto the elements of the hyper-triangular region of the M-dimensional tensor.
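The layer counts and the total count can be checked numerically by brute force for small parameters. The sketch below is only a sanity check and not part of the encoding method; the helper names `layer_counts` and `check_layer_rule` are hypothetical. It groups all M-combinations of $\{1, \ldots, N\}$ by their largest index, which corresponds to the layer in the Mth dimension.

```python
from collections import Counter
from itertools import combinations
from math import comb


def layer_counts(n_slots: int, n_pulses: int) -> dict:
    """Count hyper-triangular elements per layer n (largest index R_M = n)."""
    symbols = combinations(range(1, n_slots + 1), n_pulses)
    return dict(Counter(q[-1] for q in symbols))


def check_layer_rule(n_slots: int, n_pulses: int) -> bool:
    """Check that layer n holds C(n-1, M-1) elements and the total is C(N, M)."""
    counts = layer_counts(n_slots, n_pulses)
    per_layer_ok = all(
        counts.get(n, 0) == comb(n - 1, n_pulses - 1)
        for n in range(n_pulses, n_slots + 1)
    )
    total_ok = sum(counts.values()) == comb(n_slots, n_pulses)
    return per_layer_ok and total_ok
```

For MPPM (5, 3), the layers 3, 4, and 5 hold 1, 3, and 6 elements, which sum to $C_5^3 = 10$.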

LUT (Lookup Table) Algorithm for MPPM (N, M) Bit-Symbol Mapping
When mapping the integers 0 to $C_N^M - 1$ sequentially to the hyper-triangular region of T, we can use $a_n^{(m)}$ to determine the number of integers mapped at each layer in any dimension. From this, we can use $a_n^{(m)}$ to construct a LUT. For example, a lookup table $T_{LUT}$ with $N_{max} = 24$ and $M_{max} = 6$ satisfies the MPPM coding requirement for bit widths of 1 to 16 bits. It is a matrix with $M_{max}$ rows and $N_{max}$ columns.
A simple lookup table algorithm can be designed using $T_{LUT}$ to encode MPPM (N, M) of 1 to 16 bits, with $N \le N_{max}$ and $M \le M_{max}$. Since $T_{LUT}$ contains the number of elements in each layer of each dimension of the hyper-triangular region mapped on the M-dimensional tensor T, the index $q = [R_1, R_2, \ldots, R_M]$ of any integer between 0 and $C_N^M - 1$ in each dimension of T can be computed from that integer. Taking q as a code element symbol of MPPM (N, M) realizes the mapping of bit symbols to the code element symbols of MPPM (N, M), which greatly reduces the space complexity at the coding end of MPPM (N, M). The LUT algorithm for MPPM (N, M) coding is given in Algorithm 1.
Algorithm 1. LUT algorithm for MPPM (N, M) coding.
Initialization:
    [q_1, q_2, . . ., q_M] = 0
    B = K-bit data to be encoded
Look up the table to calculate [q_1, q_2, . . ., q_M]:
    R_M = index of the 1st element greater than B in row M of T_LUT
    q_M = R_M
    for (m decreases from M to 2)
        B = B − T_LUT(m, R_m − 1)
        R_{m−1} = index of the 1st element greater than B in row m − 1 of T_LUT
        q_{m−1} = R_{m−1}
    end
Since each row of $T_{LUT}$ is arranged in ascending order, the lookup has low complexity.
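A minimal Python sketch of Algorithm 1 follows. It rests on an assumption that is not stated explicitly above: the table entry $T_{LUT}(m, n)$ is taken to be the cumulative count $\sum_{i \le n} a_i^{(m)} = C_n^m$, which makes every row ascending and makes "the 1st element greater than B" identify the layer containing B. The names `build_lut` and `encode_mppm` are hypothetical, and a padding row 0 and column 0 are added so that indices are 1-based as in the algorithm.

```python
from bisect import bisect_right
from math import comb


def build_lut(n_max: int, m_max: int) -> list:
    """Build lut[m][n] = C(n, m) (assumed table contents).

    Row 0 is unused padding; column 0 holds zeros for m >= 1, so that
    lut[m][R - 1] is defined for every R >= 1.
    """
    return [[comb(n, m) for n in range(n_max + 1)] for m in range(m_max + 1)]


def encode_mppm(b: int, n_slots: int, n_pulses: int, lut: list) -> tuple:
    """Map the integer b to the pulse positions [q_1, ..., q_M] of MPPM(N, M)."""
    if not 0 <= b < lut[n_pulses][n_slots]:
        raise ValueError("b must lie in 0 .. C(N, M) - 1")
    q = [0] * n_pulses
    for m in range(n_pulses, 0, -1):
        # Binary search for the index of the first element greater than b
        # among columns 0 .. N of row m.
        r = bisect_right(lut[m], b, 0, n_slots + 1)
        q[m - 1] = r
        # Remove the integers held by the layers below layer r.
        b -= lut[m][r - 1]
    return tuple(q)
```

With `lut = build_lut(24, 6)`, the call `encode_mppm(0, 5, 3, lut)` gives (1, 2, 3) and `encode_mppm(9, 5, 3, lut)` gives (3, 4, 5), the first and last elements of the hyper-triangular region of MPPM (5, 3).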

CC Code and AC Code
The mud pulse channel used by the MWD telemetry transmission system is a harsh channel characterized by strong interference, nonlinearity, and multipath transmission. To improve transmission performance in the mud pulse channel, the CC (N, M) code, which is widely used in MWD telemetry transmission systems, extends the minimum time slot spacing between neighboring pulses in the MPPM (N, M) code element symbols in order to reduce the inter-pulse interference caused by the mud channel. The code element symbols of the CC (N, M) code and its set of code element symbols are defined as follows [19,20].
Definition 3. A sequence of pulse positions $p = [p_1, p_2, \ldots, p_M]$, where $p_i \in Z_N$, $i = 1, 2, \ldots, M$ and $p_1 < p_2 < \ldots < p_M$, is called a code element symbol of a CC (N, M) code if it satisfies the following conditions:
1. The interval between adjacent pulses is at least 2 time slots, i.e., the difference between the time slot numbers of adjacent pulses is greater than or equal to three.
2. The last two time slots of each code element symbol remain silent. This ensures that the interval between the last pulse and the first pulse of the next code element symbol is at least 2 time slots.
The set $P = \{p\}$ consisting of all such pulse position sequences on the coding interval $s^N$ is called the set of code element symbols of a CC (N, M) code.
From the definition of MPPM (N, M) and CC (N, M), it can be easily obtained that by inserting two silent time slots to extend the pulse interval after each pulse of the code element symbol of MPPM (N, M), the code element symbol of CC (N + 2M, M) can be obtained.That is, assuming that q = [q 1 , q 2 , . . ., q M ] is the code element symbol of MPPM (N, M), the pulse interval expansion of q yields a new pulse sequence p = [p 1 , p 2 , . . ., p M ], where p i = q i + 2(i − 1).Then, p is a code element symbol of a CC (N + 2M, M) code.
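The pulse interval expansion $p_i = q_i + 2(i-1)$ and the two conditions of Definition 3 can be illustrated with a short Python sketch. The function names `expand_to_cc` and `is_cc_symbol` are hypothetical and serve only this illustration.

```python
def expand_to_cc(q: tuple) -> tuple:
    """Expand an MPPM(N, M) symbol q into a CC(N + 2M, M) symbol p.

    enumerate() is zero-based, so 2 * i here equals 2 * (i - 1) for the
    one-based pulse index used in the text.
    """
    return tuple(q_i + 2 * i for i, q_i in enumerate(q))


def is_cc_symbol(p: tuple, n_slots: int) -> bool:
    """Check the conditions of Definition 3 for a CC(n_slots, M) symbol."""
    in_range = p[0] >= 1
    # Condition 1: adjacent slot numbers differ by at least three.
    gaps_ok = all(b - a >= 3 for a, b in zip(p, p[1:]))
    # Condition 2: the last two time slots of the symbol stay silent.
    tail_ok = p[-1] <= n_slots - 2
    return in_range and gaps_ok and tail_ok
```

For example, the MPPM (5, 3) symbol (1, 2, 3) expands to (1, 4, 7), which is a valid symbol of CC (11, 3).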
Meanwhile, in order to further improve the pulse detection performance, Baker's AC code refines the CC code by selecting only the code element symbols whose position sum $\sum_{i=1}^{M} p_i$ is even (or only those whose sum is odd) to represent the information, which is equivalent to a parity check. At the same time, in order to increase the pulse energy, the AC code specifies a pulse width of 3/2 time slots [19,20]. The code element symbol of an AC code is shown in Figure 3.
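The parity-based selection can be illustrated by enumerating the code element symbols of a small CC code by brute force and splitting them by the parity of the position sum. This is only a sketch of the selection rule; it does not model the 3/2-slot pulse width, and the names `cc_symbols` and `split_by_parity` are hypothetical.

```python
from itertools import combinations


def cc_symbols(n_slots: int, n_pulses: int) -> list:
    """Enumerate all CC(N, M) symbols according to Definition 3."""
    # Positions run over 1 .. N - 2 because the last two slots stay silent.
    candidates = combinations(range(1, n_slots - 1), n_pulses)
    return [p for p in candidates if all(b - a >= 3 for a, b in zip(p, p[1:]))]


def split_by_parity(symbols: list) -> tuple:
    """Split symbols into those with an even and an odd position sum."""
    even = [p for p in symbols if sum(p) % 2 == 0]
    odd = [p for p in symbols if sum(p) % 2 == 1]
    return even, odd
```

For CC (11, 3), the 10 code element symbols split into 6 with an even position sum and 4 with an odd position sum, so restricting the code to one parity class roughly halves the number of usable symbols in exchange for the parity check.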

Multipulse Position Checksum Code (MPCC)

Principles of MPCC Coding
Having addressed the coding complexity of MPPM (N, M), this paper designs a combination code coding method with parity check capability, i.e., MPCC (Multipulse Position Checksum Code). The coding principle of the MPCC code is shown in Figure 4.
Suppose $B_n$ is the n bits of data to be encoded. First, a parity check bit is computed for $B_n$, which expands it to n + 1 bits, and the pulse positions are then encoded with (n + 1)-bit MPPM. Finally, the MPPM-encoded pulse positions are extended with time slots, i.e., two time slots are inserted after each pulse, to obtain a pulse coding sequence that satisfies the pulse width and time slot constraints of Baker's AC code and also has even parity checking capability. MPCC encoding has performance equivalent to that of the AC code, but with reduced coding complexity. The encoding flow of MPCC is as follows. Assume that the data to be encoded are denoted as $B_n$, where n denotes the bit width:
1. Compute the even parity bit of $B_n$ and add it to the data to obtain an (n + 1)-bit symbol.
2. Encode the (n + 1)-bit symbol with the MPPM (N, M) LUT algorithm to obtain the pulse positions $q = [q_1, q_2, \ldots, q_M]$.
3. Extend the pulse intervals by $p_i = q_i + 2(i - 1)$ to obtain the MPCC code element symbol $p = [p_1, p_2, \ldots, p_M]$.
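The three stages of MPCC encoding (parity extension, MPPM encoding, time slot extension) can be combined into one short Python sketch. Several details are assumptions made only for this illustration: the parity bit is appended as the least significant bit, the lookup row for each m is taken to contain the cumulative counts $C_n^m$ and is computed on the fly instead of being stored, and the function name `mpcc_encode` as well as the example parameters (N, M) = (7, 3) for 4-bit data are hypothetical rather than taken from the parameter tables.

```python
from bisect import bisect_right
from math import comb


def mpcc_encode(data: int, n_bits: int, n_slots: int, n_pulses: int) -> tuple:
    """Encode n_bits of data into pulse positions of an MPCC symbol."""
    if not 0 <= data < 2 ** n_bits:
        raise ValueError("data does not fit in n_bits")
    if comb(n_slots, n_pulses) < 2 ** (n_bits + 1):
        raise ValueError("MPPM(N, M) cannot carry n_bits + 1 bits")
    # Stage 1: even parity; ASSUMPTION: the parity bit is appended as the LSB.
    parity = bin(data).count("1") % 2
    b = (data << 1) | parity
    # Stage 2: MPPM encoding with rows of cumulative counts C(n, m).
    q = [0] * n_pulses
    for m in range(n_pulses, 0, -1):
        row = [comb(n, m) for n in range(n_slots + 1)]
        r = bisect_right(row, b)  # index of the first element greater than b
        q[m - 1] = r
        b -= row[r - 1]
    # Stage 3: insert two silent slots after each pulse, p_i = q_i + 2(i - 1).
    return tuple(q_i + 2 * i for i, q_i in enumerate(q))
```

With these assumptions, `mpcc_encode(0, 4, 7, 3)` returns (1, 4, 7), and the 16 possible 4-bit inputs map to 16 distinct pulse position sequences on a coding interval of N + 2M = 13 time slots.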

Analysis of Coding Complexity
Assume that the bit widths of the bit symbols to be encoded at the coding side range over k = 1, 2, . . ., K bits, and that the bit symbols of each bit width are encoded using MPPM ($N_k$, $M_k$).
According to Table 1, the LUT of the MPCC code has $M_{max} = M_k$ rows and $N_{max} = N_k$ integers per row. Since the integers in each row of the LUT are in ascending order, the table lookup can use the binary search algorithm [25]. A binary search in one row takes at most $\lfloor \log_2 N_k \rfloor + 1$ steps. Encoding a k-bit symbol requires $M_k$ binary searches, so the total number of search steps is at most $M_k(\lfloor \log_2 N_k \rfloor + 1)$. Therefore, the time complexity of MPCC encoding is $O(M_k \log_2 N_k)$. MPCC encoding only needs to store the LUT of size $M_k \times N_k$, so its space complexity is $O(M_k N_k)$.
For k-bit wide bit symbols, AC coding requires an encoding table of size $2^k$ to realize the mapping of bit symbols to pulse positions. AC coding indexes the encoding table directly with the bit symbol, so its time complexity is O(1), but it requires the storage of the complete encoding table of size $2^k$. Usually, $M_k$ and $N_k$ are chosen so that the number of pulse position combinations $C_{N_k}^{M_k}$ is just greater than the total number of k-bit symbols, $2^k$, in order to minimize the redundancy, i.e., $2^k \le C_N^M \le 2^{k+1}$. The space complexity of storing a k-bit AC encoding table is thus $O(2^k) = O(C_N^M)$. In practice, AC coding needs to store the encoding tables for all bit widths k = 1, 2, . . ., K, i.e., the total size of the encoding tables is $\sum_{k=1}^{K} 2^k$, whereas MPCC only needs to store one LUT of size $M_k \times N_k$.
Therefore, although the time complexity of MPCC coding is slightly higher than that of AC coding, its space complexity is much smaller, which effectively solves the problem that the space complexity of AC coding increases exponentially with the bit width k of the bit symbol. The space and time complexity of the MPCC and AC coding schemes are shown in Figure 5.
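The orders of magnitude involved can be made concrete with a small Python calculation. The sketch counts table entries rather than bytes and uses the LUT dimensions $N_{max} = 24$, $M_{max} = 6$ and the maximum bit width of 16 bits mentioned earlier; the function names are hypothetical.

```python
def ac_table_entries(max_bits: int) -> int:
    """Total entries of the AC encoding tables for k = 1 .. K: sum of 2**k."""
    return sum(2 ** k for k in range(1, max_bits + 1))


def mpcc_lut_entries(n_max: int, m_max: int) -> int:
    """Entries of the single MPCC lookup table of size M_max x N_max."""
    return n_max * m_max


def mpcc_search_bound(n_slots: int, n_pulses: int) -> int:
    """Upper bound M * (floor(log2 N) + 1) on binary search steps per symbol."""
    # For a positive integer N, N.bit_length() equals floor(log2(N)) + 1.
    return n_pulses * n_slots.bit_length()
```

For K = 16, the AC encoding tables hold $2^{17} - 2 = 131070$ entries in total (each entry being a pulse position sequence), while the MPCC LUT holds 144 integers and encoding one symbol takes at most 30 binary search steps.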

BER Performance Analysis of MPCC and AC Codes
In order to verify that the BER performance of MPCC codes remains the same as that of AC codes despite the reduced coding complexity, BER performance simulations under Gaussian channels were carried out in this paper. The simulation conditions were as follows:
1. The coding parameters of the MPPM code are shown in Table 2.
2. The coding parameters of the AC code are shown in Table 3.
After expanding the MPPM parameters from Table 2 to MPCC coding, the MPCC coding parameters were the same as those of the AC codes in Table 3. MPCC coding and AC coding were performed on random data streams with bit widths of 4, 8, and 16 bits, and the modulated pulse signals were passed through the AWGN channel. The symbol error performance was measured by $E_s/N_0$ and the SER (Symbol Error Ratio), and the resulting SER curves are shown in Figure 6. It can be seen that regardless of whether the data width is 4, 8, or 16 bits, the symbol error performance of the MPCC code is exactly the same as that of the AC code.

Conclusions
Pulse combination coding technology based on AC codes is widely used in the field of MWD telemetry. To achieve the mapping from bit symbols to code symbols, the encoding table needs to be stored in advance at the downhole encoding end. However, due to the complex and harsh downhole environment, the downhole processor system cannot meet the space requirements of the encoding table. This paper constructs a new combination code, namely the MPCC code. The MPCC code maps bit symbols to code symbols through a lookup table with much lower space complexity than encoding tables. The MPCC code effectively solves the problem of exponential growth of the space complexity of encoding tables with increasing bit width and has the same error performance as AC codes in Gaussian channels. Compared with AC codes, MPCC codes are more suitable for systems with limited storage resources on the transmitting side, such as MWD telemetry systems.


Theorem 3. Map the integers 0 to $C_N^M - 1$ sequentially to the hyper-triangular region of the m-dimensional tensor T. The m-dimensional hyper-triangular region can be regarded as a stack of (m − 1)-dimensional hyper-triangular regions. In dimension m, the number of integers in the (m − 1)-dimensional hyper-triangular region mapped by the nth layer is $a_n^{(m)} = C_{n-1}^{m-1}$, where n = m, m + 1, . . ., N.

Figure 3. Schematic of AC code pulse position and pulse width.

Figure 5. Spatial and temporal complexity of MPCC and AC coding schemes.

Figure 6. Symbol error rate performance of MPCC and AC coding schemes.

Table 2. MPPM code M and N parameter table.

Table 3. AC code M and N parameter table.