Peer-Review Record

Decision Fusion Framework for Hyperspectral Image Classification Based on Markov and Conditional Random Fields

Remote Sens. 2019, 11(6), 624; https://doi.org/10.3390/rs11060624
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Received: 29 January 2019 / Revised: 6 March 2019 / Accepted: 8 March 2019 / Published: 14 March 2019
(This article belongs to the Special Issue Multisensor Data Fusion in Remote Sensing)

Round 1

Reviewer 1 Report

The manuscript is certainly interesting. Unfortunately, some parts of the exposition present problems. It is a real shame that the notation and the equations are sometimes unclear. 


I hope that the following suggestions can help to improve the exposition.

1) The MRF and CRF acronyms are introduced in the abstract but not in the body text.

2) Only the acronyms MRF and CRF are reported in the keywords.

3) In line 35, it is not so evident that the spatial-spectral expression refers to the concept previously presented.

4) The vector x is introduced in line 114, but the index n is introduced in line 137. I would like to suggest considering adopting an uppercase symbol to indicate the number of pixels.

5) The symbol N_i in Eq. 1 is hastily introduced in line 122. Perhaps this symbol should be bold, since it represents a set of pixels.

6) The formulation of the second term of Eq. 1 could puzzle the reader.

7) Perhaps lines 137–140 could be moved to the beginning of section 2.2.

8) I would like to propose some changes to lines 133–136: "In this work, the MRFs and CRFs are used as decision fusion methods, by combining multiple decision sources in their energy functions. We propose the fusion of two decision sources. The first is the probability outputs from the Multinomial Logistic Regression classifier (MLR) [36], i.e. a supervised classification of the spectral reflectance values. The second source of information is produced by considering the sparse spectral unmixing method SunSAL proposed in [29]". Lines 141 and 144 should be adjusted accordingly.

9) I would like to suggest considering merging section 2.1.1 with 2.3 and section 2.1.2 with 2.4.

10) Sections 2.3 and 2.4 contain a very complex notation, and the figures are not completely clear. I would like to encourage the authors to improve this part. The authors should consider whether this part of the manuscript could be extended to make it more independent from the references.

11) MRF_a, MRF_p, CRF_a, CRF_p are not clear. I can suppose, for instance, that MRF_a is the MRF applied to the output of SunSAL.

12) The SVM classifier mentioned in Figure 6 could be added to the other approaches considered in the analysis of the performance of the proposed methods. Clearly, a different up-to-date classifier that does not adopt a fusion approach would be equally suitable.


Indeed, I was particularly impressed by the quality of the figures, and in particular by Figure 6. I am curious to know which program was used to produce this figure. Furthermore, I would like to know whether the authors took inspiration from papers already published or independently developed the layout of this figure.

Each plot of Figure 5 should have the same point of view and the same axis orientation. In addition, the vertical scale should be the same. In Figures 7 and 8, the legends can be omitted. It could be sufficient for the labels on the x-axis to be bold, i.e. more evident. In this perspective, the colors can also be removed.


Furthermore, I would like to suggest checking ALL the bibliography because there are some problems. In particular:

1)In [25], an author is missing.

2)The authors of [27] are wrong.

3)In [44], the first name of the first author is also reported.  


Indeed, I have been so interested in this manuscript that I would like to share the most interesting idea that popped into my head when I read it. Indeed, I am glad when the reviewing process comes close to a friendly discussion among colleagues. I hope that the authors find my suggestion interesting.


Thinking about the characteristics of the proposed method, I came to the conclusion that Figure 6, Table II and Table III are not completely effective in synthesizing the results. In particular, Figure 6 gives a limited contribution to understanding the behavior of the proposed solution. For this reason, I would like to suggest considering a new table. I hope that the HYPOTHETICAL considerations reported about the table can be confirmed by the real data. In any case, they should be useful as a basis for the discussion. If this new table and the corresponding discussion were really interesting, they could be included in the revised version of the manuscript.

The decision fusion framework considers SunSAL and MLR. Applying a MAP classifier to the abundances and probabilities obtained by SunSAL and MLR, respectively, the corresponding classifiers are defined. Therefore, for each class C_k of the classification map and for each pixel P(i,j), one of these three events can occur:


Event E1: both classifiers (SunSAL and MLR) mark P(i,j) as C_k.

Event E2: only one classifier marks P(i,j) as C_k.

Event E3: neither classifier marks P(i,j) as C_k.


For each C_k:

S(C_k,E1) is the set of pixels of Event E1. 

S(C_k,E2) is the set of pixels of Event E2.

S(C_k,E3) is the set of pixels of Event E3.  


And

n_(C_k,E1) is the number of pixels in S(C_k,E1).

n_(C_k,E2) is the number of pixels in S(C_k,E2).

n_(C_k,E3) is the number of pixels in S(C_k,E3).


The first column of the new table should report the labels of the classification map (C1, C2, …, C10 in Figure 6).

The columns 2-4 should report n_(C_k,E1), n_(C_k,E2) and n_(C_k,E3) values. Indeed, column 4 is not particularly important.


The columns 5-7 should contain: 

The number of pixels nT_(C_k,E1)  of S(C_k,E1) that are really C_k.

The number of pixels nT_(C_k,E2)  of S(C_k,E2) that are really C_k .

The number of pixels nT_(C_k,E3)  of S(C_k,E3) that are really C_k.


The columns 8-10 should contain:

The number of pixels nFu_(C_k,E1) in (C_k,E1)  that the decision fusion framework marks C_k. 

The number of pixels nFu_(C_k,E2) in (C_k,E2)  that the decision fusion framework marks C_k. 

The number of pixels nFu_(C_k,E3) in (C_k,E3)  that the decision fusion framework marks C_k. 


The columns 11-13 should contain:

The number of pixels nFuT_(C_k,E1) in (C_k,E1) that the decision fusion framework marks C_k  and that are really C_k. 

The number of pixels nFuT_(C_k,E2) in (C_k,E2) that the decision fusion framework marks C_k  and that are really C_k. 

The number of pixels nFuT_(C_k,E3) in (C_k,E3) that the decision fusion framework marks C_k and that are really C_k. 


Reasonably, the following facts should usually be observed:

n_(C_k,E1) should be greater than n_(C_k,E2), because the two classifiers should generally be in accordance.

nT_(C_k,E1) should usually be just smaller than n_(C_k,E1). When both classifiers provide the same class, it should be the right class. Indeed, this should be all the more true when the chosen abundance and probability are much greater than the other ones.

nFu_(C_k,E1) should be more or less equal to n_(C_k,E1). When the classifiers provide the same class, the decision fusion framework should confirm the previous choice.

nFuT_(C_k,E1) should be more or less equal to nT_(C_k,E1). When the classifiers provide the same class, the decision fusion framework should confirm the previous choice, and it should be the right choice.

Since the E3 event is also analyzed, some anomalies could be highlighted. In particular, nFu_(C_k,E3) might not be zero.


The comparison between n_(C_k,E2), nFu_(C_k,E2) and nFuT_(C_k,E2) should be particularly interesting. When the two classifiers produce discordant results, the decision fusion framework should resolve the conflict by taking the right decision.



In my humble opinion, the proposed table should help to understand the results obtained. For the sake of brevity, I take into consideration only the University of Pavia dataset.


Class              = [    1     2     3     4     5     6     7     8     9]
Figure6-TableA_i,i = [22.40 66.60 32.10 85.60 83.10 30.30 47.50 20.80 88.10];
SunSAL             = [33.04 68.95 60.59 84.61 95.43 46.80 48.80 36.74 98.61];
MLR                = [50.72 67.17 70.93 88.95 97.81 55.85 80.75 61.60 95.52];
CRFL               = [77.86 88.74 92.71 89.51 99.28 63.07 96.55 58.25 99.97];


n_(C_2,E2) should be large, and the decision fusion framework produces a relevant improvement.

n_(C_4,E2) should be small, so that the decision fusion framework cannot help.

The new table could play a key role in understanding the results of the k=8 class.


Further interesting considerations can be developed if event E2 is divided into two different events: E2A, when SunSAL marks the pixel as C_k, and E2B, when MLR marks it as C_k. Carrying out this analysis, it should be possible to assess which classifier plays a key role in the decision fusion process. This analysis could be useful when tuning the parameters.
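The per-class event counting described above can be sketched in code. This is a hypothetical helper (the names event_table, sunsal_map, mlr_map, fused_map and gt are illustrative, not from the manuscript), assuming flat lists of integer class labels:

```python
# Hypothetical sketch of the proposed per-class event analysis.
# `sunsal_map`, `mlr_map`, `fused_map` and `gt` are flat lists of integer
# class labels: the MAP decisions from the SunSAL abundances, from the MLR
# probabilities, the decision-fusion output, and the ground truth.
def event_table(sunsal_map, mlr_map, fused_map, gt, k):
    def which_event(s, m):
        if s == k and m == k:
            return "E1"  # both classifiers mark the pixel as C_k
        if s == k or m == k:
            return "E2"  # exactly one classifier marks it as C_k
        return "E3"      # neither classifier marks it as C_k

    row = {p + e: 0 for e in ("E1", "E2", "E3")
           for p in ("n_", "nT_", "nFu_", "nFuT_")}
    for s, m, f, t in zip(sunsal_map, mlr_map, fused_map, gt):
        e = which_event(s, m)
        row["n_" + e] += 1
        row["nT_" + e] += (t == k)               # pixel really is C_k
        row["nFu_" + e] += (f == k)              # fusion marks it C_k
        row["nFuT_" + e] += (f == k and t == k)  # fusion marks it C_k, correctly
    return row
```

Running this per class would yield exactly one row of the proposed table; splitting E2 into E2A/E2B only requires distinguishing which of `s` or `m` equals `k`.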



Author Response

We thank the reviewers for their comments and suggestions for improvement. Below is a point-by-point reply to the reviewers. In the manuscript, all changes are denoted in red.

 

Reviewer 1

1) The MRF and CRF acronyms are introduced in the abstract but not in the body text.

We now have introduced the MRF and CRF acronyms in the body text.

2)Only the acronyms MRF and CRF are reported in the keywords.

We now have used the full names in the keywords.

3) In line 35, it is not so evident that the spatial-spectral expression refers to the concept previously presented.

We have now made this clearer.

4) The vector x is introduced in line 114, but the index n is introduced in line 137. I would like to suggest considering adopting an uppercase symbol to indicate the number of pixels.

Since a lowercase symbol has been used consistently in the text as a pixel index, we prefer to keep this.
We now define n as the number of pixels in line 114.

5) The symbol N_i in Eq. 1 is hastily introduced in line 122. Perhaps this symbol should be bold, since it represents a set of pixels.

We have adopted this suggestion throughout the manuscript.

6)The formulation of the second term of Eq.1 could puzzle the reader. 

We have now split the double sum into two separate sums. We hope that this reduces possible confusion.
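To illustrate the structure under discussion, here is a hedged sketch of a generic MRF energy of the kind in Eq. 1, with the pairwise Potts term written as two nested sums (over pixels i, then over neighbors j in N_i); the function and parameter names are illustrative, not the manuscript's notation:

```python
import math

# Hedged sketch: unary data term plus Potts smoothness term.
# `probs[i][c]` is an assumed per-pixel class probability;
# `neighbors[i]` lists the pixel indices in the neighborhood N_i.
def mrf_energy(labels, probs, neighbors, beta=1.0):
    unary = sum(-math.log(probs[i][y]) for i, y in enumerate(labels))
    pairwise = sum(beta
                   for i, y in enumerate(labels)  # outer sum over pixels i
                   for j in neighbors[i]          # inner sum over j in N_i
                   if labels[j] != y)             # Potts penalty for disagreement
    return unary + pairwise
```

Writing the neighborhood term as the two nested sums above mirrors the split of the double sum mentioned in the reply.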

7) Perhaps lines 137–140 could be moved to the beginning of section 2.2.

We followed the suggestion.

8)I would like to propose some changes in the lines 133 – 136: "In this work, the MRFs and CRFs are used as decision fusion methods, by combining multiple decision sources in their energy functions. We propose the fusion of two decision sources. The first is the probability outputs from the Multinomial Logistic Regression classifier (MLR) [36], i.e. a supervised classification of the spectral reflectance values. The second source of information is produced by considering the sparse spectral unmixing method SunSAL proposed in [29] ". Lines 141 and 144 should be adjusted accordingly.

We have made these changes.

9) I would like to suggest considering merging section 2.1.1 with 2.3 and section 2.1.2 with 2.4.

We would like to explicitly make the distinction between using the graphical models as regularizers (2.1.1 and 2.1.2) and using them as decision fusion methods (2.3 and 2.4). In our opinion, merging these sections would weaken this distinction.

10) Sections 2.3 and 2.4 contain a very complex notation, and the figures are not completely clear. I would like to encourage the authors to improve this part. The authors should consider whether this part of the manuscript could be extended to make it more independent from the references.

Given that the concepts of using MRF and CRF graphical models have been used before for decision fusion, we think it is only fair to properly cite these works. Also, the concept of cross links has been used before in a paper on the fusion of multispectral and LiDAR data [reference 28]. In our paper, we apply the same concept to the fusion of the two decision sources from hyperspectral images, so in the text, we explicitly refer to that reference for more details. We tried to explain the procedure as clearly as possible. We do not believe that a thorough discussion of the optimization by graph-cut expansion or of the choice of a contrast-sensitive Potts model would improve the readability of the manuscript, and prefer to refer to the proper literature for these.

We did include a complexity analysis of the methods in the revised version.

11) MRF_a, MRF_p, CRF_a, CRF_p are not clear. I can suppose, for instance, that MRF_a is the MRF applied to the output of SunSAL.

The reviewer is right. We clarified this in the new version.

12) The SVM classifier mentioned in Figure 6 could be added to the other approaches considered in the analysis of the performance of the proposed methods. Clearly, a different up-to-date classifier that does not adopt a fusion approach would be equally suitable.

The reviewer is right. Since the MLR classifier is used in the fusion approaches (an SVM classifier with a soft classification output could be used as well), it seemed only logical to show that one in the tables.

Indeed, I was particularly impressed by the quality of Figures and in particular by Figure 6. I am curious to know which program was used to produce this figure. Furthermore, I would like to know if the authors took inspiration from papers already published or they independently developed the layout of this figure.

Figure 6 was produced using the standard MATLAB built-in function confusionmat in combination with the plotConfMat function from Vahe Tshitoyan (20/08/2017). His function can be downloaded from:

https://github.com/vtshitoyan/plotConfMat/blob/master/plotConfMat.m

https://www.mathworks.com/matlabcentral/mlcdownloads/downloads/submissions/67631/versions/2/previews/plotConfMat.m/index.html
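For readers without MATLAB, the quantity such a figure displays can be reproduced in plain Python. This is a minimal sketch (the function names are ours, not from either toolbox), computing a confusion matrix and its row-normalized percentages, with class-wise accuracies on the diagonal:

```python
# Minimal sketch, assuming integer labels 0..n_classes-1: the row-normalized
# confusion matrix that a plotConfMat-style figure displays (the manuscript
# used MATLAB's confusionmat for this step).
def conf_mat(y_true, y_pred, n_classes):
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1  # rows index the true class, columns the prediction
    return cm

def row_percent(cm):
    # Express each row as percentages of that row's total: class-wise
    # accuracy sits on the diagonal, confusions off the diagonal.
    return [[100.0 * v / max(sum(row), 1) for v in row] for row in cm]
```

The percentage matrix can then be rendered as a heatmap with any plotting library to obtain a figure analogous to Figure 6.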

Each plot of Figure 5 should have the same point of view and the same axis orientation. In addition, the vertical scale should be the same. In Figure 7 and Figure 8, the legends can be omitted. It could be sufficient that labels in x-axis were bold i.e. more evident. In this perspective, also the colors can be removed.

We agree with the reviewer and adapted the figures accordingly. The plots in Figure 5 now have the same point of view and axis orientation. In the boxplots of Figures 7 and 9, the legends are omitted and the labels on the x-axis are in bold. We opted to retain colors. We realize that the original colors were not very distinguishable; therefore, we improved the visualization by choosing a more distinguishable color palette following these visualization color guidelines:

https://blog.graphiq.com/finding-the-right-color-palettes-for-data-visualizations-fcd4e707a283 (Right color palettes for data visualization)

https://www.perceptualedge.com/articles/b-eye/choosing_colors.pdf (Choosing colors)

With these changes, each method has its own very distinct color, which makes it easier to locate the accuracies of each particular method at a short glance at the plot, compared to when they would all be in black.

Furthermore, I would like to suggest checking ALL the bibliography because there are some problems. In particular:

1)In [25], an author is missing.

2)The authors of [27] are wrong.

3)In [44], the first name of the first author is also reported.  

The references have been corrected, and the complete list has been checked and corrected.

Indeed, I have been so interested in this manuscript that I would like to share the most interesting idea that popped into my head when I read it. Indeed, I am glad when the reviewing process comes close to a friendly discussion among colleagues. I hope that the authors find my suggestion interesting.

Thinking about the characteristics of the proposed method, I came to the conclusion that Figure 6, Table II and Table III are not completely effective in synthesizing the results. In particular, Figure 6 gives a limited contribution to understanding the behavior of the proposed solution. For this reason, I would like to suggest considering a new table. I hope that the HYPOTHETICAL considerations reported about the table can be confirmed by the real data. In any case, they should be useful as a basis for the discussion. If this new table and the corresponding discussion were really interesting, they could be included in the revised version of the manuscript.

The decision fusion framework considers SunSAL and MLR. Applying a MAP classifier to the abundances and probabilities obtained by SunSAL and MLR, respectively, the corresponding classifiers are defined. Therefore, for each class C_k of the classification map and for each pixel P(i,j), one of these three events can occur:

•Event E1: both classifiers (SunSAL and MLR) mark P(i,j) as C_k.

•Event E2: only one classifier marks P(i,j) as C_k.

•Event E3: neither classifier marks P(i,j) as C_k.

 

For each C_k:

•S(C_k,E1) is the set of pixels of Event E1. 

•S(C_k,E2) is the set of pixels of Event E2.

•S(C_k,E3) is the set of pixels of Event E3.  

And

•n_(C_k,E1) is the number of pixels in S(C_k,E1).

•n_(C_k,E2) is the number of pixels in S(C_k,E2).

•n_(C_k,E3) is the number of pixels in S(C_k,E3).

The first column of the new table should report the labels of the classification map (C1, C2, …, C10 in Figure 6).

The columns 2-4 should report n_(C_k,E1), n_(C_k,E2) and n_(C_k,E3) values. Indeed, column 4 is not particularly important.

 

The columns 5-7 should contain: 

•The number of pixels nT_(C_k,E1)  of S(C_k,E1) that are really C_k.

•The number of pixels nT_(C_k,E2)  of S(C_k,E2) that are really C_k .

•The number of pixels nT_(C_k,E3)  of S(C_k,E3) that are really C_k.

 

The columns 8-10 should contain:

•The number of pixels nFu_(C_k,E1) in (C_k,E1)  that the decision fusion framework marks C_k. 

•The number of pixels nFu_(C_k,E2) in (C_k,E2)  that the decision fusion framework marks C_k. 

•The number of pixels nFu_(C_k,E3) in (C_k,E3)  that the decision fusion framework marks C_k. 

 

The columns 11-13 should contain:

•The number of pixels nFuT_(C_k,E1) in (C_k,E1) that the decision fusion framework marks C_k  and that are really C_k. 

•The number of pixels nFuT_(C_k,E2) in (C_k,E2) that the decision fusion framework marks C_k  and that are really C_k. 

•The number of pixels nFuT_(C_k,E3) in (C_k,E3) that the decision fusion framework marks C_k and that are really C_k. 

 

Reasonably, the following facts should usually be observed:

•n_(C_k,E1) should be greater than n_(C_k,E2), because the two classifiers should generally be in accordance.

•nT_(C_k,E1) should usually be just smaller than n_(C_k,E1). When both classifiers provide the same class, it should be the right class. Indeed, this should be all the more true when the chosen abundance and probability are much greater than the other ones.

•nFu_(C_k,E1) should be more or less equal to n_(C_k,E1). When the classifiers provide the same class, the decision fusion framework should confirm the previous choice.

•nFuT_(C_k,E1) should be more or less equal to nT_(C_k,E1). When the classifiers provide the same class, the decision fusion framework should confirm the previous choice, and it should be the right choice.

•Since the E3 event is also analyzed, some anomalies could be highlighted. In particular, nFu_(C_k,E3) might not be zero.

The comparison between n_(C_k,E2), nFu_(C_k,E2) and nFuT_(C_k,E2) should be particularly interesting. When the two classifiers produce discordant results, the decision fusion framework should resolve the conflict by taking the right decision.

 

In my humble opinion, the proposed table should help to understand the results obtained. For the sake of brevity, I take into consideration only the University of Pavia dataset.

Class              = [    1     2     3     4     5     6     7     8     9]
Figure6-TableA_i,i = [22.40 66.60 32.10 85.60 83.10 30.30 47.50 20.80 88.10];
SunSAL             = [33.04 68.95 60.59 84.61 95.43 46.80 48.80 36.74 98.61];
MLR                = [50.72 67.17 70.93 88.95 97.81 55.85 80.75 61.60 95.52];
CRFL               = [77.86 88.74 92.71 89.51 99.28 63.07 96.55 58.25 99.97];

•n_(C_2,E2) should be large, and the decision fusion framework produces a relevant improvement.

•n_(C_4,E2) should be small, so that the decision fusion framework cannot help.

•The new table could play a key role in understanding the results of the k=8 class.

Further interesting considerations can be developed if event E2 is divided into two different events: E2A, when SunSAL marks the pixel as C_k, and E2B, when MLR marks it as C_k. Carrying out this analysis, it should be possible to assess which classifier plays a key role in the decision fusion process. This analysis could be useful when tuning the parameters.

 

We appreciate the effort that the reviewer has taken to think along. As a matter of fact, we have also thought about other ways to present the experimental results. In the end, we decided to present the data in the standard way, as is done in most of the literature, i.e. by listing classification accuracies and/or confusion matrices.

The method that is presented by the reviewer may indeed be an interesting way to validate the complementarity of both decision sources and the efficacy of their fusion.

There is, however, one important remark to make: as far as we can follow the reasoning of the reviewer, the conclusions would be correct if the proposed fusion methods based their decisions only on the decisions made by the two sources. This is however not the case: the MRFL and CRFL methods perform a spatial regularization along with the decision fusion, so the decisions also rely on the decisions made on the neighbors! Because of this, some of the conclusions of the reviewer no longer hold; e.g., nFu_(C_k,E3) does not just represent some anomalies, but represents a large fraction of the decisions.

Another remark is that all these experiments are performed with very low numbers of training samples (10 per class). The results of Figure 6 are based on one such experiment, for illustration purposes, while the results in the tables are averages over 100 independent runs. Notice the large standard deviations, which are unavoidable with low numbers of training samples. If the reviewer's suggestion is applied to one experiment, the results may not be representative, while averaged over 100 experiments, the envisaged effects may average out.

We did perform the counting based on one experiment, and the reviewer can find the results below. The reviewer will see that the numbers confirm some of his conclusions but not others. Because of this, the reviewer's suggestion of a class-specific analysis based on these numbers is also not foolproof.

Anyhow, we decided not to include this analysis in the manuscript, since it would require quite some space to properly explain, and much more analysis and discussion. A comparison would, e.g., be required between the methods MRF_a, MRF_p and MRFL, to investigate the effect of the spatial regularization; but even then, this regularization is done in different ways by all these methods.

Pavia
Event counts (MRF):

class   E1     E2     E2a    E2b    E3     nT_E1  nT_E2  nT_E3  nF_E1  nF_E2  nF_E3  nFuT_E1  nFuT_E2  nFuT_E3
1       885    4733   1640   3093   36978  696    3017   1917   725    2910   1486   650      2538     1261
2       9971   10526  5544   4982   22099  8615   7255   2105   9717   8206   1295   8571     6843     851
3       1135   3840   1304   2536   37621  757    824    228    953    1350   436    750      773      163
4       3563   2253   1578   675    36780  1639   370    50     3483   848    76     1628     318      17
5       1033   315    27     288    41248  921    200    9      1032   268    9      921      199      5
6       2163   10585  5540   5045   29848  1356   2384   1170   1249   1782   315    1131     1472     213
7       958    3327   2314   1013   38311  502    517    166    582    625    204    502      516      150
8       842    4419   1068   3351   37335  505    1639   684    684    2313   833    470      1422     384
9       936    2222   2095   127    39438  547    37     3      863    199    153    547      37       3


Event counts (CRF):

class   E1     E2     E2a    E2b    E3     nT_E1  nT_E2  nT_E3  nF_E1  nF_E2  nF_E3  nFuT_E1  nFuT_E2  nFuT_E3
1       885    4733   1640   3093   36978  696    3017   1917   493    1708   1016   436      1473     860
2       9971   10526  5544   4982   22099  8615   7255   2105   9619   8123   1724   8451     6521     893
3       1135   3840   1304   2536   37621  757    824    228    926    1190   361    734      713      132
4       3563   2253   1578   675    36780  1639   370    50     3371   1056   274    1569     260      19
5       1033   315    27     288    41248  921    200    9      1021   255    12     915      191      6
6       2163   10585  5540   5045   29848  1356   2384   1170   1141   2165   874    1076     1355     195
7       958    3327   2314   1013   38311  502    517    166    711    1315   943    502      511      146
8       842    4419   1068   3351   37335  505    1639   684    430    1252   424    317      867      288
9       936    2222   2095   127    39438  547    37     3      904    602    686    547      37       3
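A quick consistency check on the reported Pavia event counts (first three classes shown): for every class, E1 + E2 + E3 should equal the same per-class pixel total, and E2 should equal E2a + E2b:

```python
# Sanity check on the reported Pavia event counts above.
rows = [  # class, E1, E2, E2a, E2b, E3 (first three Pavia classes)
    (1, 885, 4733, 1640, 3093, 36978),
    (2, 9971, 10526, 5544, 4982, 22099),
    (3, 1135, 3840, 1304, 2536, 37621),
]
totals = [e1 + e2 + e3 for _, e1, e2, _, _, e3 in rows]
assert len(set(totals)) == 1  # E1 + E2 + E3 is identical for every class
assert all(e2 == e2a + e2b for _, _, e2, e2a, e2b, _ in rows)
print(totals[0])  # -> 42596, the same total for every class
```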


Pines
Event counts (MRF):

class   E1     E2     E2a    E2b    E3     nT_E1  nT_E2  nT_E3  nF_E1  nF_E2  nF_E3  nFuT_E1  nFuT_E2  nFuT_E3
1       621    1505   950    555    7294   458    513    342    518    591    216    447      427      139
2       589    1635   677    958    7196   301    335    82     521    641    218    297      312      67
3       360    345    86     259    8715   321    87     19     324    84     5      319      72       2
4       532    241    96     145    8647   491    126    49     518    141    49     491      126      40
5       447    97     85     12     8876   432    8      0      447    12     2      432      8        0
6       390    930    598    332    8100   303    336    266    344    360    222    302      286      137
7       530    2178   687    1491   6712   458    1141   739    428    873    247    411      790      220
8       303    1219   794    425    7898   186    280    44     260    598    88     186      245      34
9       809    480    376    104    8131   757    339    71     798    363    66     754      322      59
10      154    740    336    404    8526   105    176    53     130    265    91     105      172      51

 

Event counts (CRF):

class   E1     E2     E2a    E2b    E3     nT_E1  nT_E2  nT_E3  nF_E1  nF_E2  nF_E3  nFuT_E1  nFuT_E2  nFuT_E3
1       621    1505   950    555    7294   458    513    342    527    615    241    456      461      167
2       589    1635   677    958    7196   301    335    82     440    524    155    301      313      77
3       360    345    86     259    8715   321    87     19     335    46     7      313      24       2
4       532    241    96     145    8647   491    126    49     523    133    46     491      117      38
5       447    97     85     12     8876   432    8      0      447    21     5      432      8        0
6       390    930    598    332    8100   303    336    266    357    372    217    303      291      146
7       530    2178   687    1491   6712   458    1141   739    449    996    345    430      877      298
8       303    1219   794    425    7898   186    280    44     259    578    63     186      255      35
9       809    480    376    104    8131   757    339    71     804    395    55     757      335      44
10      154    740    336    404    8526   105    176    53     121    270    74     101      148      26

 

Reviewer 2 Report

The paper is well organized. A well-planned set of experiments was conducted to verify the efficacy of the proposed method. One question: why are the results presented in this paper different from those in your 2018 IGARSS conference paper?

Author Response

We thank the reviewers for their comments and suggestions for improvement. Below is a point-by-point reply to the reviewers. In the manuscript, all changes are denoted in red.

Reviewer 2

The paper is well organized. A well-planned set of experiments was conducted to verify the efficacy of the proposed method. One question: why are the results presented in this paper different from those in your 2018 IGARSS conference paper?

The reviewer has correctly noticed differences of the order of a few percent with respect to the results of the IGARSS 2018 paper. There are a number of reasons for the differences. First of all, we performed new experiments. Given the small training sizes, the standard deviations on the results are always quite large. Second, in the IGARSS paper, we used different unary potentials (given by −alpha and −p rather than −ln(alpha) and −ln(p)). For some reason, they gave slightly better results, but it was hard to justify their use (other than referring to a paper that did the same). In this manuscript, we chose to use the unary terms properly, for both the proposed methods and the methods we compared with. As a result, all accuracies went down by a few percent. Finally, in the IGARSS paper, we used fixed values for the parameters lambda, beta and gamma, while in this manuscript, we performed a grid search.
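The two unary-potential choices mentioned above can be sketched as follows. This is a hedged illustration; the clamping epsilon is our addition to avoid log(0), not something stated in the reply:

```python
import math

# The IGARSS version used the negative score directly; the manuscript uses
# the negative log-likelihood, the standard choice for MRF/CRF energies.
def unary_igarss(p):
    return -p

def unary_proper(p, eps=1e-12):
    return -math.log(max(p, eps))  # clamp to avoid log(0) for zero scores
```

Both are decreasing in p (more confident scores give lower energy), but the log version penalizes near-zero probabilities much more strongly, which is one plausible source of the accuracy differences discussed.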

 

 

Reviewer 3 Report

The work presented here is about supervised pixel classification in hyperspectral remote sensing under small training size scenarios (10 pixels per class). Authors propose decision fusion frameworks based on Markov and Conditional Random Fields with cross Links (MRFL and CRFL, respectively). These frameworks use two decision sources: (i) fractional abundances from sparse spectral unmixing (SunSAL), and (ii) probability outputs from a supervised classifier (Multinomial Logistic Regression - MLR), where both spatial and spectral information are employed. Additionally, these frameworks can be extended to a third decision source, being quite flexible.

Experiments use two well-known datasets: Indian Pines and Pavia University. The proposed MRFL and CRFL methods are compared to other approaches: SunSAL, MLR, Linear Combination (LC), a couple of MRFG variants from literature (MRFG_a and MRFG), and MRF/CRF applied to abundances or probabilities as a single source (MRF_a, MRF_p, CRF_a and CRF_p). Comparison is made in terms of Overall Accuracy (OA), Average Accuracy (AA), class-wise accuracy and kappa coefficient, including visualization maps.

Further analysis on (i) Beta and Alpha effects, (ii) slightly different sources employed and (iii) using three different decision sources, the original two plus a new one (probabilities derived from morphological features) are also included.

In general terms, I think this is a good paper, well written, well structured, and holding some merit. Before publication, some issues would need to be addressed:

1) It would be great to have some analysis or at least comments on the computational complexity of the algorithms. How complex are they? Resources needed? Execution time? In other words, what is the cost of improving the classification accuracy?

2) The authors made it quite clear in the abstract and introduction that very limited training data are used. However, in the analysis, only one training size is used (10 samples per class). Does that mean the proposed methods do not work so well in comparison with the rest under larger training sizes? If so, where are the boundaries for this? 20 samples per class? 50?

3) Lines 275-278: ‘One can clearly notice that there is more confusion between SunSAL and the MLR classifier than between the MLR and SVM classifiers, indicating that the abundances are more complementary to the MLR probabilities than the SVM class probabilities.’ I’m not sure about the meaning of this sentence. Could you please re-write/explain?

4) Lines 315 and 332: Are these subsection captions? They are not correctly formatted.

5) Section 3 caption is all in capitals. It should be corrected.

Author Response

We thank the reviewers for their comments and suggestions for improvement. Below is a point-by-point reply to the reviewers. In the manuscript, all changes are denoted in red.

Reviewer 3

The work presented here is about supervised pixel classification in hyperspectral remote sensing under small training size scenarios (10 pixels per class). Authors propose decision fusion frameworks based on Markov and Conditional Random Fields with cross Links (MRFL and CRFL, respectively). These frameworks use two decision sources: (i) fractional abundances from sparse spectral unmixing (SunSAL), and (ii) probability outputs from a supervised classifier (Multinomial Logistic Regression - MLR), where both spatial and spectral information are employed. Additionally, these frameworks can be extended to a third decision source, being quite flexible.

Experiments use two well-known datasets: Indian Pines and Pavia University. The proposed MRFL and CRFL methods are compared to other approaches: SunSAL, MLR, Linear Combination (LC), a couple of MRFG variants from literature (MRFG_a and MRFG), and MRF/CRF applied to abundances or probabilities as a single source (MRF_a, MRF_p, CRF_a and CRF_p). Comparison is made in terms of Overall Accuracy (OA), Average Accuracy (AA), class-wise accuracy and kappa coefficient, including visualization maps.
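As a side note on the evaluation criteria listed above, a minimal sketch (our own illustration, not code from the paper under review) of how OA, AA and the kappa coefficient follow from a confusion matrix:

```python
import numpy as np

def accuracy_metrics(cm):
    """OA, AA and Cohen's kappa from a confusion matrix (rows = reference labels)."""
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    oa = np.trace(cm) / total                          # overall accuracy
    aa = np.mean(np.diag(cm) / cm.sum(axis=1))         # mean of per-class accuracies
    pe = (cm.sum(axis=0) @ cm.sum(axis=1)) / total**2  # expected chance agreement
    kappa = (oa - pe) / (1 - pe)                       # agreement beyond chance
    return oa, aa, kappa

# Toy 2-class example: 85 of 100 pixels correctly labelled.
oa, aa, kappa = accuracy_metrics([[40, 10], [5, 45]])
print(oa, aa, kappa)
```

Kappa discounts the agreement expected by chance, which is why it is reported alongside OA and AA for imbalanced class distributions.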

Further analyses of (i) the effects of Beta and Alpha, (ii) slightly different sources, and (iii) the use of three decision sources (the original two plus a new one, probabilities derived from morphological features) are also included.

In general terms, I think this is a good paper, well written, well structured, and holding some merit. Before publication, some issues would need to be addressed:

1) It would be great to have some analysis or at least comments on the computational complexity of the algorithms. How complex are they? Resources needed? Execution time? In other words, what is the cost of improving the classification accuracy?

We have included the following paragraph on the computational complexity of the proposed algorithms:

“Our proposed method uses the graph-cut $\alpha$-expansion algorithm [38-41], which has a worst-case complexity of O(mn^2|P|) for a single optimization problem, where m denotes the number of edges, n the number of nodes in the graph, and |P| the cost of the minimum cut. Thus, the theoretical computational complexity of our proposed method is O(kCmn^2|P|), with k the upper bound on the number of iterations and C the number of classes. With a non-cautious addition of edges to the graph, for instance adding a cross link between each node and all other nodes from the second source, there would be a quadratic increase in the computational complexity.

On the other hand, the empirical complexity in real scenarios has been shown to be between linear and quadratic w.r.t. the graph size \cite{empComplexity}.”

@ARTICLE{empComplexity,
  author  = {Yuri Boykov and Vladimir Kolmogorov},
  title   = {An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year    = {2004},
  volume  = {26},
  number  = {9},
  pages   = {1124--1137}
}

We also included some words on the actual execution times of the experiments.

“The experiments were run on a PC with an Intel i7-6700K CPU and 32 GB of RAM. The execution time for one run with fixed parameters was on the order of a second for the MRFL and a minute for the CRFL. When performing grid search and averaging over 100 runs, we ran the experiments on the UAntwerpen HPC (CalcUA supercomputing facility), whose nodes have 128 GB or 256 GB of RAM and 2.4 GHz 14-core Broadwell CPUs; distributing the runs over these nodes led to speedups by a factor of 10-50.”
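The quadratic blow-up warned about in the complexity discussion above can be illustrated with a toy edge count; the function below and the 4-connected-grid assumption are our own sketch, not code from the paper:

```python
def edge_counts(h, w):
    """Edge counts for an h-by-w 4-connected grid fused with a second source."""
    n = h * w                          # nodes per decision source
    grid = h * (w - 1) + (h - 1) * w   # spatial edges within one source
    one_to_one = n                     # one cross link per node: grows linearly in n
    all_to_all = n * n                 # cross links to every node: grows quadratically
    return grid, one_to_one, all_to_all

# Edge counts for a 10x10 and a 100x100 image.
for side in (10, 100):
    print(side * side, edge_counts(side, side))
```

A 10x increase in image side length multiplies the one-to-one cross links by 100 but the all-to-all cross links by 10,000, which is why cautious edge addition matters.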

2) The authors made it quite clear in the abstract and introduction that they use very limited training data. However, in the analysis, only one training size is used (10 samples per class). Does that mean the proposed methods do not perform as well as the rest under larger training sizes? If so, where are the boundaries for this? 20 samples per class? 50?

This is a good question. We have been running experiments with increasing training sample sizes and we noticed that the differences between the different fusion methods became smaller. This indicates that the advantages of the proposed method level out for larger training sizes. We decided not to include these experiments since it would make the manuscript more complicated. However, in the new version, we have included this as a remark in the discussion.

3) Lines 275-278: ‘One can clearly notice that there is more confusion between SunSAL and the MLR classifier than between the MLR and SVM classifiers, indicating that the abundances are more complementary to the MLR probabilities than the SVM class probabilities.’ I’m not sure about the meaning of this sentence. Could you please re-write/explain?

This part has been rewritten as:

“One can clearly notice that there is a higher spread in the confusion matrices of SunSAL versus MLR than in the ones of SVM versus MLR. This indicates that SunSAL and MLR disagree more than MLR and SVM do, and that the abundances provide more complementary information to the MLR probabilities than that the SVM class probabilities do. This makes the abundances a good candidate decision source in a decision fusion approach.”
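The "spread" reading of those cross confusion matrices can be made concrete with a small sketch (the function name and toy labels below are hypothetical, not from the paper): the off-diagonal mass of the matrix built from two classifiers' label maps is exactly their disagreement rate.

```python
import numpy as np

def cross_confusion(labels_a, labels_b, n_classes):
    """Confusion matrix between two label maps and their disagreement rate."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (labels_a, labels_b), 1)        # count each (a, b) label pair
    disagreement = 1.0 - np.trace(cm) / cm.sum()  # fraction of off-diagonal mass
    return cm, disagreement

a = np.array([0, 0, 1, 1, 2, 2])   # labels from source A
b = np.array([0, 1, 1, 1, 2, 0])   # labels from source B
cm, d = cross_confusion(a, b, 3)
print(d)   # 2 of 6 pixels differ
```

A higher disagreement rate between two sources suggests they carry more complementary information, which is the intuition behind pairing SunSAL abundances with MLR probabilities in the fusion.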

4) Lines 315 and 332: Are these subsection captions? They are not correctly formatted.

This has been corrected.

5) Section 3 caption is all in capitals. It should be corrected.

This has been corrected.

Round 2

Reviewer 1 Report

Dear authors,

I was a little bit crestfallen that my suggestion was not included in the revised version of the manuscript. After spending a lot of time trying to improve a manuscript without being an author, I did not find this outcome particularly gratifying.

Thinking about it, I am glad I tried. I had an idea that I considered interesting and, positively and generously, I proposed it to the authors. Therefore, I would like to thank the authors for carrying out the experimentations. I hope that they have found some interesting elements in my proposal, anyway.

Unfortunately, since the journal gives only three days for providing this review, I am obliged to postpone the examination of the tables provided.

The manuscript is certainly ready to be published. My only recommendation, left to the goodwill of the authors, is to include a reference to the routine plotConfMat developed by Vahe Tshitoyan [ref1] in the bibliography of the paper. In this way, the author of the routine will receive due recognition and the manuscript will implicitly show how to produce such effective figures. By the way, [ref1] differs from the link provided by the authors, since that link did not work.


[ref1] https://www.mathworks.com/matlabcentral/fileexchange/64185-plot-confusion-matrix


Reviewer 3 Report

I’m satisfied with the changes introduced by the authors and, therefore, I think the paper can be published if the editors agree.
