


Python to write multiple dataframes and highlight rows inside an excel file



























$begingroup$


I am trying to write multiple dataframes into an Excel file, one after another, using the same logic. Nothing changes from one dataframe to the next except the number of columns or records; the functions stay the same.



For example,



writer = pd.ExcelWriter(OutputName)   
Emp_ID_df.to_excel(writer,'Sheet1',index = False)
Visa_df.to_excel(writer,'Sheet2',index = False)
custom_df_1.to_excel(writer,'Sheet3',index = False)
writer.save()


Once the file is written, I want to highlight a row if any of its Boolean columns contains False. I am not aware of any library in Anaconda that can highlight cells in Excel, so I am using the native Excel COM interface through win32com.



from win32com.client import Dispatch #to work with excel files
Pre_Out_df_ncol = Emp_ID_df.shape[1]
Pre_Out_df_nrow = Emp_ID_df.shape[0]
RequiredCol_let = colnum_num_string(Pre_Out_df_ncol)
arr = (Emp_ID_df.select_dtypes(include=[bool])).eq(False).any(axis=1).values
ReqRows = np.arange(1, len(Emp_ID_df)+ 1)[arr].tolist()

Pre_Out_df_ncol_2 = Visa_df.shape[1]
Pre_Out_df_nrow_2 = Visa_df.shape[0]
RequiredCol_let_2 = colnum_num_string(Pre_Out_df_ncol_2)
arr_2 = (Visa_df.select_dtypes(include=[bool])).eq(False).any(axis=1).values
ReqRows_2 = np.arange(1, len(Visa_df)+ 1)[arr_2].tolist()

Pre_Out_df_ncol_3 = custom_df_1.shape[1]
Pre_Out_df_nrow_3 = custom_df_1.shape[0]
RequiredCol_let_3 = colnum_num_string(Pre_Out_df_ncol_3)
arr_3 = (custom_df_1.select_dtypes(include=[bool])).eq(False).any(axis=1).values
ReqRows_3 = np.arange(1, len(custom_df_1)+ 1)[arr_3].tolist()

xlApp = Dispatch("Excel.Application")
xlwb1 = xlApp.Workbooks.Open(OutputName)
xlApp.visible = False
print ("\n...Highlighting the Output File at " + datetime.now().strftime('%Y-%m-%d %H:%M:%S'))

for i in range(len(ReqRows)):
    j = ReqRows[i] + 1
    xlwb1.sheets('Sheet1').Range('A' + str(j) + ":" + RequiredCol_let + str(j)).Interior.ColorIndex = 6
xlwb1.sheets('Sheet1').Columns.AutoFit()


for i in range(len(ReqRows_2)):
    j = ReqRows_2[i] + 1
    xlwb1.sheets('Sheet2').Range('A' + str(j) + ":" + RequiredCol_let_2 + str(j)).Interior.ColorIndex = 6
xlwb1.sheets('Sheet2').Columns.AutoFit()


for i in range(len(ReqRows_3)):
    j = ReqRows_3[i] + 1
    xlwb1.sheets('Sheet3').Range('A' + str(j) + ":" + RequiredCol_let_3 + str(j)).Interior.ColorIndex = 6
xlwb1.sheets('Sheet3').Columns.AutoFit()


Finally, I rename the sheets:



xlwb1.Sheets("Sheet1").Name = "XXXXA"
xlwb1.Sheets("Sheet2").Name = "XXXXASDAD"
xlwb1.Sheets("Sheet3").Name = "SADAD"
xlwb1.Save()


Now there are a few problems here



1) The number of dataframes keeps growing, which means I am writing the same code again and again.



2) The highlighting works, but it is too slow. Sometimes 90% of the rows need to be highlighted. With 1 million rows, highlighting them one after another takes 35 minutes.



Kindly help me with this.

















$endgroup$












  • $begingroup$
    Did you have a look at xlsxwriter.readthedocs.io/example_pandas_conditional.html and xlsxwriter.readthedocs.io/working_with_conditional_formats.html ?
    $endgroup$
    – Graipher
    2 hours ago























python python-3.x pandas














edited 2 hours ago









Mathias Ettinger










asked 2 hours ago









Sid29























2 Answers






























$begingroup$

First, starting from your code, you should realize that you are repeating yourself three times. This goes against the principle Don't Repeat Yourself (DRY).



The only real differences between processing your three sheets are their names and the underlying dataframes, so you could turn this into two functions:



from win32com.client import Dispatch
import numpy as np
import pandas as pd


def highlight_false(df):
    # 1-based positions of the rows where any boolean column is False
    arr = (df.select_dtypes(include=[bool])).eq(False).any(axis=1).values
    return np.arange(1, len(df) + 1)[arr].tolist()


def color_rows(sheet, rows, col):
    for row in rows:
        # +1 skips the header row written by to_excel
        cells = f"A{row+1}:{col}{row+1}"
        sheet.Range(cells).Interior.ColorIndex = 6
    sheet.Columns.AutoFit()


if __name__ == "__main__":
    Emp_ID_df = ...
    writer = pd.ExcelWriter(OutputName)
    Emp_ID_df.to_excel(writer, sheet_name='Sheet1', index=False)
    writer.close()  # write the file to disk before Excel opens it

    excel_app = Dispatch("Excel.Application")
    workbook = excel_app.Workbooks.Open(OutputName)
    excel_app.visible = False

    sheet_names = ["Sheet1"]
    dfs = [Emp_ID_df]
    for sheet_name, df in zip(sheet_names, dfs):
        sheet = workbook.Sheets(sheet_name)
        rows = highlight_false(df)
        col = colnum_num_string(df.shape[1])  # helper from the question
        color_rows(sheet, rows, col)
    workbook.Save()
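On the speed problem from the question: every `sheet.Range(...)` assignment in `color_rows` is a separate call into Excel, and when most rows are flagged, many of them are adjacent. One possible way to cut the number of calls is to merge consecutive row numbers into runs and color each run as a single range. The sketch below is an illustration rather than a measured fix. `group_consecutive` and `run_addresses` are hypothetical names, and the code assumes the row list is sorted (as `highlight_false` returns it) and that the sheet has one header row.

```python
def group_consecutive(rows):
    """Collapse sorted row numbers into (first, last) runs of adjacent rows."""
    runs = []
    for row in rows:
        if runs and row == runs[-1][1] + 1:
            # Row directly follows the current run: extend it.
            runs[-1] = (runs[-1][0], row)
        else:
            # Gap found: start a new run.
            runs.append((row, row))
    return runs


def run_addresses(rows, last_col, header_rows=1):
    """Build one Excel address per run, shifted down past the header."""
    return [
        f"A{first + header_rows}:{last_col}{last + header_rows}"
        for first, last in group_consecutive(rows)
    ]


if __name__ == "__main__":
    print(run_addresses([1, 2, 3, 7, 9, 10], "D"))
    # ['A2:D4', 'A8:D8', 'A10:D11']
```

Each address could then be assigned with `sheet.Range(address).Interior.ColorIndex = 6` in place of the per-row loop. How much time this saves depends on how the flagged rows cluster.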


However, there is an even easier method using xlsxwriter:



import pandas as pd

Emp_ID_df = ...

writer = pd.ExcelWriter(OutputName, engine='xlsxwriter')
Emp_ID_df.to_excel(writer, sheet_name="Sheet1", index=False)

workbook = writer.book
format1 = workbook.add_format({'bg_color': '#FFC7CE',
                               'font_color': '#9C0006'})
dfs = [Emp_ID_df]
for df, sheet in zip(dfs, writer.sheets.values()):
    nrow, ncol = df.shape
    col_letter = colnum_num_string(ncol + 1)
    cells = f"A1:{col_letter}{nrow+1}"
    sheet.conditional_format(cells, {"type": "cell",
                                     "criteria": "==",
                                     "value": 0,
                                     "format": format1})
writer.close()  # writer.save() was removed in pandas 2.0


You might have to ensure that the sheets do not get out of sync with the dataframes, or just keep track of which sheet name you save each dataframe to.



In addition, I followed Python's official style guide, PEP8, which recommends lower_case names for functions and variables, and I renamed your variables so that they are a lot clearer.
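Both snippets in this answer rely on `colnum_num_string` from the question, whose definition is not shown. As a rough idea of what such a helper might look like, here is a minimal sketch assuming it takes a 1-based column number and returns the Excel column letters. The name `column_letter` is hypothetical, and the real helper may behave differently. If you use xlsxwriter anyway, its `xlsxwriter.utility.xl_col_to_name` does the same job for 0-based column indices.

```python
def column_letter(number):
    """Convert a 1-based column number to Excel letters (1 -> A, 27 -> AA)."""
    if number < 1:
        raise ValueError("column numbers start at 1")
    letters = ""
    while number > 0:
        # Excel columns are a bijective base-26 system, hence the "- 1".
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


if __name__ == "__main__":
    for n in (1, 26, 27, 703):
        print(n, column_letter(n))
```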

















$endgroup$

































    $begingroup$


    1) My number of dataframe increases and which means I am writing up
    the same code again and again




    You're repeating a lot of code which is essentially doing the same thing with just a couple of variables.
    I think for this you should wrap it in a function.
    I've taken your code and put it inside a function:



    def highlight_false_cells(sheetName, dataFrame, OutputName):
        Pre_Out_df_ncol = dataFrame.shape[1]
        Pre_Out_df_nrow = dataFrame.shape[0] # Is this required? It doesn't look to be used.
        RequiredCol_let = colnum_num_string(Pre_Out_df_ncol)
        arr = (dataFrame.select_dtypes(include=[bool])).eq(False).any(axis=1).values
        ReqRows = np.arange(1, len(dataFrame) + 1)[arr].tolist()

        xlApp = Dispatch("Excel.Application")
        xlwb1 = xlApp.Workbooks.Open(OutputName)
        xlApp.visible = False
        print("\n...Highlighting the Output File at " + datetime.now().strftime('%Y-%m-%d %H:%M:%S'))

        for i in range(len(ReqRows)):
            j = ReqRows[i] + 1
            xlwb1.sheets(sheetName).Range('A' + str(j) + ":" + RequiredCol_let + str(j)).Interior.ColorIndex = 6
        xlwb1.sheets(sheetName).Columns.AutoFit()

        xlwb1.Save()


    To call this for your dataframes:



    highlight_false_cells("XXXXA", Emp_ID_df, OutputName)
    highlight_false_cells("XXXXASDAD", Visa_df, OutputName)
    highlight_false_cells("SADAD", custom_df_1, OutputName)


    I'm not really familiar with the packages you're using, so there may be a mistake in the logic. Hopefully it still gives you a good example of how to take your work and put it into a function.



    I'd also recommend looking into a programming principle called "DRY", which stands for "Don't Repeat Yourself". Learning to spot areas like this, where many lines are repeated, should make your code easier to write and maintain.






    – cphilip






    $endgroup$













      Your Answer





      StackExchange.ifUsing("editor", function () {
      return StackExchange.using("mathjaxEditing", function () {
      StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
      StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
      });
      });
      }, "mathjax-editing");

      StackExchange.ifUsing("editor", function () {
      StackExchange.using("externalEditor", function () {
      StackExchange.using("snippets", function () {
      StackExchange.snippets.init();
      });
      });
      }, "code-snippets");

      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "196"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: false,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: null,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });






      Sid29 is a new contributor. Be nice, and check out our Code of Conduct.










      draft saved

      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f214042%2fpython-to-write-multiple-dataframes-and-highlight-rows-inside-an-excel-file%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      2 Answers
      2






      active

      oldest

      votes








      2 Answers
      2






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      3












      $begingroup$

      First, starting from your code, you should realize that you are repeating yourself, three times. This goes against the principle Don't repeat Yourself (DRY).



      The only real difference between processing your three sheets are their name and the underlying dataframe, so you could make this into two functions:



      from win32com.client import Dispatch
      import pandas as pd

      def highlight_false(df):
      arr = (df.select_dtypes(include=[bool])).eq(False).any(axis=1).values
      return np.arange(1, len(df) + 1)[arr].tolist()


      def color_rows(sheet, rows, col):
      for row in rows:
      cells = f"A{row+1}:{col}{row+1}"
      sheet.Range(cells).Interior.ColorIndex = 6
      sheet.Columns.AutoFit()


      if __name__ == "__main__":
      Emp_ID_df = ...
      writer = pd.ExcelWriter(OutputName)
      Emp_ID_df.to_excel(writer, 'Sheet1', index=False)

      excel_app = Dispatch("Excel.Application")
      workbook = excel_app.Workbooks.Open(OutputName)
      excel_app.visible = False

      sheet_names = ["Sheet1"]
      dfs = [Emp_ID_df]
      for sheet_name, df in zip(sheet_names, dfs):
      sheet = workbook.Sheets(sheet)
      rows = highlight_false(df)
      col = colnum_num_string(df.shape[1])
      color_rows(sheet, rows, col)


      However, there is an even easier method using xlsxwriter:



      import pandas as pd

      Emp_ID_df = ...

      writer = pd.ExcelWriter(OutputName, engine='xlsxwriter')
      Emp_ID_df.to_excel(writer, sheet_name="Sheet1", index=False)

      workbook = writer.book
      format1 = workbook.add_format({'bg_color': '#FFC7CE',
      'font_color': '#9C0006'})
      dfs = [Emp_ID_df]
      for df, sheet in zip(dfs, writer.sheets.values()):
      nrow, ncol = df.shape
      col_letter = colnum_num_string(ncol + 1)
      cells = f"A1:{col_letter}{nrow+1}"
      sheet.conditional_format(cells, {"type": "cell",
      "criteria": "==",
      "value": 0,
      "format": format1})
      writer.save()


      You might have to ensure that the sheets do not get out of snyc from the data frames, or just keep track of what name you save each dataframe to.



      In addition I used Python's official style-guide, PEP8, which recommends using lower_case for functions and variables as well as renaming your variables so they are a lot clearer.






      share|improve this answer











      $endgroup$


















        3












        $begingroup$

        First, starting from your code, you should realize that you are repeating yourself, three times. This goes against the principle Don't repeat Yourself (DRY).



        The only real difference between processing your three sheets are their name and the underlying dataframe, so you could make this into two functions:



        from win32com.client import Dispatch
        import pandas as pd

        def highlight_false(df):
        arr = (df.select_dtypes(include=[bool])).eq(False).any(axis=1).values
        return np.arange(1, len(df) + 1)[arr].tolist()


        def color_rows(sheet, rows, col):
        for row in rows:
        cells = f"A{row+1}:{col}{row+1}"
        sheet.Range(cells).Interior.ColorIndex = 6
        sheet.Columns.AutoFit()


        if __name__ == "__main__":
        Emp_ID_df = ...
        writer = pd.ExcelWriter(OutputName)
        Emp_ID_df.to_excel(writer, 'Sheet1', index=False)

        excel_app = Dispatch("Excel.Application")
        workbook = excel_app.Workbooks.Open(OutputName)
        excel_app.visible = False

        sheet_names = ["Sheet1"]
        dfs = [Emp_ID_df]
        for sheet_name, df in zip(sheet_names, dfs):
        sheet = workbook.Sheets(sheet)
        rows = highlight_false(df)
        col = colnum_num_string(df.shape[1])
        color_rows(sheet, rows, col)


        However, there is an even easier method using xlsxwriter:



        import pandas as pd

        Emp_ID_df = ...

        writer = pd.ExcelWriter(OutputName, engine='xlsxwriter')
        Emp_ID_df.to_excel(writer, sheet_name="Sheet1", index=False)

        workbook = writer.book
        format1 = workbook.add_format({'bg_color': '#FFC7CE',
        'font_color': '#9C0006'})
        dfs = [Emp_ID_df]
        for df, sheet in zip(dfs, writer.sheets.values()):
        nrow, ncol = df.shape
        col_letter = colnum_num_string(ncol + 1)
        cells = f"A1:{col_letter}{nrow+1}"
        sheet.conditional_format(cells, {"type": "cell",
        "criteria": "==",
        "value": 0,
        "format": format1})
        writer.save()


        You might have to ensure that the sheets do not get out of snyc from the data frames, or just keep track of what name you save each dataframe to.



        In addition I used Python's official style-guide, PEP8, which recommends using lower_case for functions and variables as well as renaming your variables so they are a lot clearer.






        share|improve this answer











        $endgroup$
















          3












          3








          3





          $begingroup$

          First, starting from your code, you should realize that you are repeating yourself, three times. This goes against the principle Don't repeat Yourself (DRY).



          The only real difference between processing your three sheets are their name and the underlying dataframe, so you could make this into two functions:



          from win32com.client import Dispatch
          import pandas as pd

          def highlight_false(df):
          arr = (df.select_dtypes(include=[bool])).eq(False).any(axis=1).values
          return np.arange(1, len(df) + 1)[arr].tolist()


          def color_rows(sheet, rows, col):
          for row in rows:
          cells = f"A{row+1}:{col}{row+1}"
          sheet.Range(cells).Interior.ColorIndex = 6
          sheet.Columns.AutoFit()


          if __name__ == "__main__":
          Emp_ID_df = ...
          writer = pd.ExcelWriter(OutputName)
          Emp_ID_df.to_excel(writer, 'Sheet1', index=False)

          excel_app = Dispatch("Excel.Application")
          workbook = excel_app.Workbooks.Open(OutputName)
          excel_app.visible = False

          sheet_names = ["Sheet1"]
          dfs = [Emp_ID_df]
          for sheet_name, df in zip(sheet_names, dfs):
          sheet = workbook.Sheets(sheet)
          rows = highlight_false(df)
          col = colnum_num_string(df.shape[1])
          color_rows(sheet, rows, col)


          However, there is an even easier method using xlsxwriter:



          import pandas as pd

          Emp_ID_df = ...

          writer = pd.ExcelWriter(OutputName, engine='xlsxwriter')
          Emp_ID_df.to_excel(writer, sheet_name="Sheet1", index=False)

          workbook = writer.book
          format1 = workbook.add_format({'bg_color': '#FFC7CE',
          'font_color': '#9C0006'})
          dfs = [Emp_ID_df]
          for df, sheet in zip(dfs, writer.sheets.values()):
          nrow, ncol = df.shape
          col_letter = colnum_num_string(ncol + 1)
          cells = f"A1:{col_letter}{nrow+1}"
          sheet.conditional_format(cells, {"type": "cell",
          "criteria": "==",
          "value": 0,
          "format": format1})
          writer.save()


          You might have to ensure that the sheets do not get out of snyc from the data frames, or just keep track of what name you save each dataframe to.



          In addition I used Python's official style-guide, PEP8, which recommends using lower_case for functions and variables as well as renaming your variables so they are a lot clearer.






          share|improve this answer











          $endgroup$











          edited 1 hour ago

























          answered 1 hour ago









          Graipher






































              $begingroup$


              1) My number of dataframe increases and which means I am writing up
              the same code again and again




              You're repeating a lot of code that essentially does the same thing, with only a couple of variables changing.
              For this, you should wrap it in a function.
              I've taken your code and put it inside one:



              import numpy as np
              from datetime import datetime
              from win32com.client import Dispatch

              def highlight_false_cells(sheetName, dataFrame, OutputName):
                  Pre_Out_df_ncol = dataFrame.shape[1]
                  Pre_Out_df_nrow = dataFrame.shape[0]  # Is this required? It doesn't look to be used.
                  RequiredCol_let = colnum_num_string(Pre_Out_df_ncol)
                  # True for every row with at least one False in a boolean column
                  arr = (dataFrame.select_dtypes(include=[bool])).eq(False).any(axis=1).values
                  # 1-based positions of those rows within the dataframe
                  ReqRows = np.arange(1, len(dataFrame) + 1)[arr].tolist()

                  xlApp = Dispatch("Excel.Application")
                  xlwb1 = xlApp.Workbooks.Open(OutputName)
                  xlApp.visible = False
                  print("\n...Highlighting the Output File at " + datetime.now().strftime('%Y-%m-%d %H:%M:%S'))

                  for i in range(len(ReqRows)):
                      j = ReqRows[i] + 1  # + 1 skips the header row
                      xlwb1.sheets(sheetName).Range('A' + str(j) + ":" + RequiredCol_let + str(j)).Interior.ColorIndex = 6
                  xlwb1.sheets(sheetName).Columns.AutoFit()

                  xlwb1.Save()


              To call this for your dataframes:



              highlight_false_cells("XXXXA", Emp_ID_df, OutputName)
              highlight_false_cells("XXXXASDAD", Visa_df, OutputName)
              highlight_false_cells("SADAD", custom_df_1, OutputName)
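              If the number of dataframes keeps growing, the calls themselves can be driven from a dictionary, so adding a sheet means adding one entry. A minimal sketch; `highlight_all`, `record` and `output.xlsx` are made up for illustration, and in real use you'd pass `highlight_false_cells` and your actual dataframes:

```python
def highlight_all(frames_by_sheet, highlight, output_name):
    """Call highlight(sheet_name, frame, output_name) once per sheet."""
    for sheet_name, frame in frames_by_sheet.items():
        highlight(sheet_name, frame, output_name)


# Stand-ins so the sketch runs without Excel: strings instead of dataframes,
# and a recorder instead of highlight_false_cells.
calls = []


def record(sheet_name, frame, output_name):
    calls.append((sheet_name, frame, output_name))


frames = {"XXXXA": "Emp_ID_df", "XXXXASDAD": "Visa_df", "SADAD": "custom_df_1"}
highlight_all(frames, record, "output.xlsx")
print(calls)
```

              Passing the highlighting function in as an argument is just one option; a plain `for` loop over a list of (sheet name, dataframe) pairs would do the same job.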


              I'm not really familiar with the packages you're using, so there may be a mistake in the logic. Hopefully, though, it gives you a good example of how to take your work and put it into a function.
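              To double-check the row arithmetic without opening Excel, the same selection can be mimicked in plain Python. This is only a sketch under the assumption that the sheet has exactly one header row; `rows_to_highlight` and the sample records are invented for the illustration and stand in for the pandas/numpy lines:

```python
def rows_to_highlight(records, header_rows=1):
    """Return sheet row numbers of records with at least one False boolean value."""
    excel_rows = []
    for position, record in enumerate(records, start=1):
        # Only real booleans count, roughly mirroring select_dtypes(include=[bool]).
        flags = [value for value in record.values() if isinstance(value, bool)]
        if any(flag is False for flag in flags):
            # Shift by the header so the number matches the row in the sheet.
            excel_rows.append(position + header_rows)
    return excel_rows


# Made-up sample data: one dict per dataframe row.
sample = [
    {"emp_id": 1, "id_ok": True, "visa_ok": True},
    {"emp_id": 2, "id_ok": False, "visa_ok": True},
    {"emp_id": 3, "id_ok": True, "visa_ok": False},
]
print(rows_to_highlight(sample))
```

              Here the second and third records each contain a `False`, so they map to sheet rows 3 and 4, which matches `j = ReqRows[i] + 1`.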



              I'd also recommend looking into a programming principle called "DRY", which stands for "Don't Repeat Yourself". If you learn to spot areas like this, where many lines are nearly identical, your code will become easier to write and maintain.












              $endgroup$



























                  answered 1 hour ago









                  cphilip






















